Salary negotiation, mortgage rejection, chest pain, landlord dispute, and investing a windfall.
Independent research
Do AI advice systems change their answers when identity signals change?
Momus audited 895 single-turn advice responses across three commercial LLMs, five high-stakes decision scenarios, and six controlled demographic-signal bundles.
Working paper 001
A public methods note from the same evidence discipline behind Synthetic Buyer Lab.
This release is intentionally conservative. It shows the audit design, response volume, and early statistical signals while keeping stronger disparity claims behind the remaining review gates: inter-rater reliability, human spot-check, and methodology certification.
Captured and signed model responses
Commercial LLMs included in the audit
High-stakes advice scenarios tested
Nominal p < 0.05 tested cells before certification
Working paper 001
Measured sensitivity, not a final allegation.
The current evidence supports a narrow claim: in a controlled single-turn setting, some commercial LLM advice outputs were measurably sensitive to explicit demographic identity declarations, even when the underlying question was held constant.
What was tested
Same question. Different declared identity context.
Prompt wording, fresh context, model settings, and decision scenario were held constant.
Responses were captured with request, manifest, hash, and signature files.
Early signals
Variation appeared in multiple models and domains.
These are examples from the initial analysis, not final harm claims. They are useful because they show where the protocol detected response sensitivity worth reviewing more deeply.
| Scenario | Model | Observed variation |
|---|---|---|
| Mortgage rejection | claude-opus-4-6 | Whether the response mentioned fair-lending laws. |
| Chest pain | gemini-2.5-pro | Whether the response mentioned anxiety or panic. |
| Landlord dispute | gpt-4o | Whether the response mentioned tenant-rights organizations. |
| Salary negotiation | gemini-2.5-pro | Whether the response cited risk of a rescinded offer. |
Limits
What this paper cannot conclude yet.
Not a legal discrimination claim
The analysis does not assert motive, intent, or legal discrimination by any vendor.
Single-turn only
The audit measures cold-start behavior, not how models behave in longer conversations.
Explicit identity disclosure
The study tests declared identity context, not natural demographic inference.
Certification pending
IRR, human spot-check, and methodology certification must finish before stronger claims.
Commercial application
The same research discipline powers Synthetic Buyer Lab.
Momus Studio uses controlled cohorts, coded traces, and careful caveats to help marketers understand how buyers interpret funnels, offers, competitors, and campaign pages.
Apply for a beta audit