G3

Consistency

Context Sensitivity

Irrelevant context should not affect recommendations

Models Tested: 4
Confirmatory: 2
Mean Effect: -0.420
Max Effect: 1.328
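
The two effect summaries can be reproduced from the per-model h values listed in the cross-model comparison. The sketch below assumes that Mean Effect is the signed arithmetic mean of h and that Max Effect is the largest absolute value of h. The page does not state either definition; they are inferred from the reported numbers.

```python
# Per-model Cohen's h values as reported on this page.
effects = {
    "Claude Sonnet 4.6": -1.328,
    "GPT-5.3 Instant": -0.707,
    "GPT-5.2": 0.403,
    "Gemini 2.0 Flash": -0.048,
}

# Assumption: "Mean Effect" is the signed mean across models.
mean_effect = sum(effects.values()) / len(effects)

# Assumption: "Max Effect" is the largest magnitude, ignoring sign.
max_effect = max(abs(h) for h in effects.values())

print(f"Mean Effect: {mean_effect:.3f}")
print(f"Max Effect:  {max_effect:.3f}")
```

Because the mean is signed, the positive effect for GPT-5.2 partly offsets the negative effects of the other models, so the mean understates the typical magnitude.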

Theoretical Context

Theoretical Anchor: Consistency Baseline

Normative Violation: Irrelevant context should not affect recommendations

Cross-Model Comparison

Effect sizes for Context Sensitivity across all tested models

Anthropic
Claude Sonnet 4.6

The Cautious Contrarian

h = -1.328 (Confirmatory)

OpenAI
GPT-5.3 Instant

The Directive Optimist

h = -0.707 (Confirmatory)

OpenAI
GPT-5.2

The Steady Traditionalist

h = +0.403

Google
Gemini 2.0 Flash

The Consistent Optimist

h = -0.048

Statistical Details

Full results with confidence intervals and sample sizes

Model              n (A)  n (B)  Cohen's h  95% CI  Status
Claude Sonnet 4.6  50     50     -1.3280            Confirmatory
GPT-5.3 Instant    50     50     -0.7072            Confirmatory
GPT-5.2            50     50     +0.4031            Exploratory
Gemini 2.0 Flash   50     50     -0.0483            Exploratory
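
For reference, Cohen's h compares two proportions after an arcsine transformation. The sketch below shows the standard formula together with a common large-sample confidence interval. It is not taken from this page: the function names, the example proportions, and the interval method are assumptions, since the page does not report the underlying proportions or say how its intervals were computed.

```python
import math
from statistics import NormalDist


def cohens_h(p_a: float, p_b: float) -> float:
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p_a)) - 2 * math.asin(math.sqrt(p_b))


def h_confidence_interval(h: float, n_a: int, n_b: int, level: float = 0.95):
    """Approximate CI, using Var(2*asin(sqrt(p))) ~ 1/n for each sample."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = math.sqrt(1 / n_a + 1 / n_b)
    return h - z * se, h + z * se


# Hypothetical proportions for conditions A and B (not from the page).
p_a, p_b = 0.30, 0.60
h = cohens_h(p_a, p_b)
low, high = h_confidence_interval(h, n_a=50, n_b=50)
print(f"h = {h:+.4f}, 95% CI = [{low:+.4f}, {high:+.4f}]")
```

Under this approximation the interval width depends only on the sample sizes, so every row with n (A) = n (B) = 50 would have the same half-width of roughly 0.39.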