C1

Calibration

Probability Calibration

Stated probabilities should match actual frequencies

Models Tested: 5
Confirmatory: 0
Mean Effect: 0.096
Max Effect: 0.248

Theoretical Context

Theoretical Anchor

Lichtenstein et al. (1982)

Normative Violation

Stated probabilities should match actual frequencies
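
The norm above can be made concrete with a small calibration check. The page does not say how calibration was scored for these models, so the following is only a generic sketch: the function names, the equal-width binning, and the toy data are all assumptions introduced for illustration. It groups forecasts by stated probability and compares each group's mean stated probability with the observed hit rate.

```python
from collections import defaultdict


def calibration_table(stated, outcomes, n_bins=10):
    """Bin forecasts by stated probability (equal-width bins) and return
    (mean stated probability, observed frequency, count) per non-empty bin."""
    if len(stated) != len(outcomes):
        raise ValueError("stated and outcomes must have the same length")
    bins = defaultdict(list)
    for p, y in zip(stated, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the top bin
        bins[idx].append((p, y))
    rows = []
    for idx in sorted(bins):
        items = bins[idx]
        mean_p = sum(p for p, _ in items) / len(items)
        freq = sum(y for _, y in items) / len(items)
        rows.append((mean_p, freq, len(items)))
    return rows


def expected_calibration_error(stated, outcomes, n_bins=10):
    """Count-weighted average gap between stated probability and hit rate."""
    rows = calibration_table(stated, outcomes, n_bins)
    total = sum(n for _, _, n in rows)
    return sum(n * abs(mean_p - freq) for mean_p, freq, n in rows) / total


# Hypothetical toy data: ten answers given at "90% sure", six of them correct.
stated = [0.9] * 10
outcomes = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
ece = expected_calibration_error(stated, outcomes)
print(f"calibration gap: {ece:.2f}")
```

A perfectly calibrated forecaster would show a gap of zero in every bin; in the toy data the stated 90% overshoots the 60% hit rate, which is the overconfidence pattern the normative standard rules out.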

Cross-Model Comparison

Effect sizes for Probability Calibration across all tested models

Google, Gemini 2.0 Flash
The Consistent Optimist
h = +0.248

Anthropic, Claude Sonnet 4.6
The Cautious Contrarian
h = +0.220

OpenAI, GPT-5.2
The Steady Traditionalist
h = -0.208

OpenAI, GPT-5.3 Instant
The Directive Optimist
h = +0.113

OpenAI, GPT-5.4 Thinking
The Deliberative Calibrator
h = +0.105

Statistical Details

Full results with confidence intervals and sample sizes

Model              n (A)  n (B)  Cohen's h  95% CI       Status
Gemini 2.0 Flash   50     50     +0.2483    (not shown)  Exploratory
Claude Sonnet 4.6  50     50     +0.2195    (not shown)  Exploratory
GPT-5.2            50     50     -0.2076    (not shown)  Exploratory
GPT-5.3 Instant    50     50     +0.1125    (not shown)  Exploratory
GPT-5.4 Thinking   50     50     +0.1051    (not shown)  Exploratory
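
For readers unfamiliar with the effect size used in the table, Cohen's h is the difference between two arcsine-transformed proportions. The sketch below shows the standard formula and a common large-sample confidence interval. The page does not give the underlying proportions or its own interval method, so the helper names and the standard-error approximation sqrt(1/n_a + 1/n_b) are assumptions here, not the site's procedure. The last lines recompute the summary figures from the h values listed in the table.

```python
import math


def cohens_h(p_a, p_b):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p_a)) - 2 * math.asin(math.sqrt(p_b))


def h_confidence_interval(h, n_a, n_b, z=1.96):
    """Approximate interval for h, assuming the large-sample
    standard error sqrt(1/n_a + 1/n_b) (an assumption, see lead-in)."""
    se = math.sqrt(1 / n_a + 1 / n_b)
    return h - z * se, h + z * se


# Effect sizes exactly as listed in the table above.
reported_h = {
    "Gemini 2.0 Flash": 0.2483,
    "Claude Sonnet 4.6": 0.2195,
    "GPT-5.2": -0.2076,
    "GPT-5.3 Instant": 0.1125,
    "GPT-5.4 Thinking": 0.1051,
}

mean_h = sum(reported_h.values()) / len(reported_h)      # signed mean
max_abs_h = max(abs(v) for v in reported_h.values())     # largest magnitude

print(f"mean h = {mean_h:.3f}, max |h| = {max_abs_h:.3f}")
for model, h in reported_h.items():
    lo, hi = h_confidence_interval(h, 50, 50)
    print(f"{model}: h = {h:+.4f}, approx. 95% CI [{lo:+.3f}, {hi:+.3f}]")
```

The signed mean of the five listed values rounds to 0.096 and the largest magnitude to 0.248, matching the Mean Effect and Max Effect figures at the top of the page. With 50 observations per condition the assumed standard error is 0.2, so every approximate interval spans zero, which is consistent with all five results being labelled Exploratory rather than Confirmatory.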