Only schema-valid judgements are used for consensus and score aggregation.
Debate Result · real
Disputatio Fake E2E Fixture 20260601T201940Z
Deterministic fixture debate for M4 fake jury validation.
Debate ID: 7cbaefe7-c406-4170-80ef-fb974f41bb1d
Why this result?
AI-assisted comparative judgement, not objective truth. Run a real jury to generate a winner rationale.
No aggregation result yet. At least two schema-valid judgements are needed.
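The two-judgement threshold above can be illustrated with a minimal sketch. The report does not say how consensus or scores are actually aggregated, so the majority vote, the mean score, and the field names `schema_valid`, `winner`, and `total_score` are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean
from typing import Optional

MIN_VALID_JUDGEMENTS = 2  # threshold stated in the report


def aggregate(judgements: list[dict]) -> Optional[dict]:
    """Aggregate schema-valid judgements; return None below the threshold.

    Field names (`schema_valid`, `winner`, `total_score`) are hypothetical.
    """
    valid = [j for j in judgements if j.get("schema_valid")]
    if len(valid) < MIN_VALID_JUDGEMENTS:
        return None  # corresponds to "No aggregation result yet."

    votes = Counter(j["winner"] for j in valid)
    ranked = votes.most_common()
    # A tie between the top two vote counts yields no consensus winner.
    tied = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
    return {
        "winner": None if tied else ranked[0][0],
        "mean_score": mean(j["total_score"] for j in valid),
        "valid_count": len(valid),
    }
```

In this run all three judgements are invalid, so a function like this would return `None`, which matches the empty aggregation panel.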
Evaluated Source
Transparent demo/source context for this run. LLM judgements are comparative signals, not objective truth.
Bias & Reliability Signals
Exploratory MVP signals, not causal bias proof.
Counts valid JSON responses before stricter judgement-schema checks.
Derived from score divergence. This is not a causal provider-bias claim.
Need at least two schema-valid judgements.
MVP run uses original order and visible Speaker A/B labels. Use anonymized/order-swapped research runs later.
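As a rough illustration of a signal "derived from score divergence", the sketch below computes the range and population standard deviation of the judges' total scores. The report does not state which divergence measure is used, so both statistics and the function name `score_divergence` are assumptions.

```python
from statistics import pstdev
from typing import Optional


def score_divergence(total_scores: list[float]) -> Optional[dict]:
    """Describe how far the judges' total scores spread apart.

    Returns None with fewer than two scores, mirroring the
    "Need at least two schema-valid judgements" message.
    """
    if len(total_scores) < 2:
        return None
    return {
        "range": max(total_scores) - min(total_scores),
        "stdev": pstdev(total_scores),  # population standard deviation
    }
```

A large spread only says that the judges disagree; as the page notes, it is not evidence of provider bias.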
Jury Verdicts
Clear product view first; raw diagnostics remain below.
| LLM Judge | Verdict | Score | Confidence | Why |
|---|---|---|---|---|
| meta-llama/llama-3.3-70b-instruct:free (failed) | n/a | n/a | n/a | Model response could not be parsed as valid judgement JSON. |
| openai/gpt-oss-120b:free (failed) | n/a | n/a | n/a | Model response could not be parsed as valid judgement JSON. |
| z-ai/glm-4.5-air:free (invalid_schema) | n/a | n/a | n/a | Model response could not be parsed as valid judgement JSON. |
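Two of the three judges failed at the HTTP layer rather than at parsing: the error diagnostics in the model cards show a 429 with a `retry_after_seconds` hint and a 404 caused by a privacy setting. A minimal sketch of how a client might decide whether and how long to wait is shown here. The helper name `retry_delay_seconds` and the 5-second fallback are assumptions, and the body layout is taken only from the 429 diagnostic in this report.

```python
import json
from typing import Optional


def retry_delay_seconds(status: int, body: str, default: float = 5.0) -> Optional[float]:
    """Return how long to wait before retrying, or None if a retry is pointless.

    Only HTTP 429 is treated as retryable here; the 404 in this report is a
    configuration problem that waiting will not fix.
    """
    if status != 429:
        return None
    try:
        # Layout assumed from the diagnostic: error -> metadata -> retry_after_seconds
        metadata = json.loads(body)["error"]["metadata"]
        return float(metadata["retry_after_seconds"])
    except (ValueError, KeyError, TypeError):
        # Unparseable body or missing hint: fall back to a fixed delay (assumed).
        return default
```

For the rate-limited llama judge in this run, such a helper would suggest waiting 18 seconds before a second attempt.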
Model Cards
Operational details for trust and debugging.
meta-llama/llama-3.3-70b-instruct:free
failed
- Winner: n/a
- JSON: False
- Schema: False
- Latency: n/a ms
- Tokens: 0
- Cost: $0
- Provider: n/a
- Finish: n/a
Model response could not be parsed as valid judgement JSON.
Error: RuntimeError
Error diagnostic
OpenRouter HTTP 429: {"error": {"message": "Provider returned error", "code": 429, "metadata": {"raw": "meta-llama/llama-3.3-70b-instruct:free is temporarily rate-limited upstream. Please retry shortly, or add your own key to accumulate your rate limits: https://openrouter.ai/settings/integrations", "provider_name": "Venice", "is_byok": false, "retry_after_seconds": 18, "retry_after_seconds_raw": 17.199, "headers": {"Retry-After": "18"}}}, "user_id": "user_3Bl6LBShLIGou4GzxDLzm73U616"}
openai/gpt-oss-120b:free
failed
- Winner: n/a
- JSON: False
- Schema: False
- Latency: n/a ms
- Tokens: 0
- Cost: $0
- Provider: n/a
- Finish: n/a
Model response could not be parsed as valid judgement JSON.
Error: RuntimeError
Error diagnostic
OpenRouter HTTP 404: {"error": {"message": "No endpoints available matching your guardrail restrictions and data policy. Configure: https://openrouter.ai/settings/privacy", "code": 404}}
z-ai/glm-4.5-air:free
invalid_schema
- Winner: n/a
- JSON: False
- Schema: False
- Latency: 33052 ms
- Tokens: 2184
- Cost: $0
- Provider: Z.AI
- Finish: length
Model response could not be parsed as valid judgement JSON.
Error: invalid_schema
Response preview
{
"summary": "The debate presents a balanced discussion on the tension between speed and thoroughness in decision-making processes. Speaker A advocates for faster decisions when evidence is sufficient, while Speaker B emphasizes the risks of inadequate review. Both sides make reasonable points, with Speaker A particularly effective in arguing for proportionality in review processes.",
"total_score": 7.1,
"confidence": 0.85,
"dimensions": {
"logic": 8,
"evidence": 3,
"countera
Score Dimensions
Only schema-valid model judgements are shown here. Invalid JSON responses stay visible in the model cards and raw artifacts.
| Model | Dimension | Score | Confidence | Reason |
|---|---|---|---|---|
| No dimension scores. | | | | |
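The JSON and Schema flags in the model cards suggest a two-stage check: first parse the raw response as JSON, then apply a stricter judgement-schema check. The sketch below shows one way that could look. The required fields are guessed from the keys visible in the response preview plus a hypothetical `winner` field; the real judgement schema is not shown on this page.

```python
import json

# Assumed required fields: keys visible in the response preview,
# plus a hypothetical "winner" field.
REQUIRED_FIELDS = {
    "summary": str,
    "winner": str,
    "total_score": (int, float),
    "confidence": (int, float),
    "dimensions": dict,
}


def check_response(raw_text: str) -> dict:
    """Return the JSON and Schema flags for one raw model response."""
    try:
        payload = json.loads(raw_text)
    except json.JSONDecodeError:
        # Output cut off mid-object (finish reason "length") ends up here.
        return {"json": False, "schema": False}

    schema_ok = isinstance(payload, dict) and all(
        isinstance(payload.get(name), expected)
        for name, expected in REQUIRED_FIELDS.items()
    )
    return {"json": True, "schema": schema_ok}
```

Under these assumptions, the truncated z-ai/glm-4.5-air:free preview fails at the first stage, which is consistent with both flags being False in its model card.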
Artifacts
| Kind | Path | Size | Checksum |
|---|---|---|---|
| model_raw_response | jury/raw_openrouter_7cbaefe7-c406-4170-80ef-fb974f41bb1d_z-ai_glm-4.5-air_free.json | 12228 | a31950ded96a61ab |
| model_raw_response | jury/raw_openrouter_7cbaefe7-c406-4170-80ef-fb974f41bb1d_openai_gpt-oss-120b_free.json | 290 | 268225afc7d0ed65 |
| model_raw_response | jury/raw_openrouter_7cbaefe7-c406-4170-80ef-fb974f41bb1d_meta-llama_llama-3.3-70b-instruct_free.json | 634 | 203f03d46b54e07b |
Manifest
- Software: 0.5.2
- Prompt Hash: 782b115e9241a33b15
- Rubric Hash: c9144dd5d4c5fcd823
- Input Hash: 6644a1c7859bc24992