Review proposed changes to our union contracts before each negotiation round.

Gated · contract-review

| # | Case | Description | Target | Seed | Step | Provenance | Expected | Fold |
|---|------|-------------|--------|------|------|------------|----------|------|
| 1 | clean-clause-step-2 | Within-bounds clause under default seed. | is_problematic | 13 | 2/80 | nl-gen | false | test |
| 2 | alt-seed-clean-clause | Within-bounds clause under alternate seed. | is_problematic | 42 | 0/80 | nl-gen | false | train |
| 3 | clean-clause-step-0 | First clause is within bounds; agent must not flag. | is_problematic | 13 | 0/80 | nl-gen | false | train |
| 4 | seed-7-clean-clause-step2 | Second clean clause under the third seed. | is_problematic | 7 | 2/80 | nl-gen | false | train |
| 5 | overtime-carveout-flagged | First problematic clause under default seed; covers the overtime carve-out past-miss class. | is_problematic | 13 | 3/80 | nl-gen | true | train |
| 6 | seed-7-clean-clause-step0 | Clean-clause sanity check under the third seed. | is_problematic | 7 | 0/80 | nl-gen | false | train |
| 7 | clean-clause-step-1 | Within-bounds clause; agent must not over-flag. | is_problematic | 13 | 1/80 | nl-gen | false | train |
| 8 | alt-seed-clean-clause-step1 | Second clean clause under alternate seed. | is_problematic | 42 | 1/80 | nl-gen | false | train |
| 9 | alt-seed-problematic-clause-step6 | Excessive-bound clause under alternate seed. | is_problematic | 42 | 6/80 | nl-gen | true | test |
| 10 | grievance-precedent-flagged | Problematic clause that ties back to the grievance-precedent review failure mode. | is_problematic | 13 | 5/80 | nl-gen | true | train |
| 11 | notification-window-30-day-breach | Direct past miss: a notification window below the 60-day requirement is the canonical termination-clause violation the agent must catch. | is_problematic | 42 | 2/80 | nl-gen | true | train |
| 12 | seed-7-problematic-step1 | Problematic-clause sanity check under the third seed. | is_problematic | 7 | 1/80 | nl-gen | true | train |
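
To sanity-check how the twelve cases above are distributed, the listing can be transcribed into plain records and tallied by fold and expected label. This is only a minimal sketch: the `Case` class and the `fold_balance` helper are hypothetical names chosen for illustration and are not part of the tool, and the records are copied by hand from the listing rather than loaded from any export.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """One evaluation case (hypothetical record type, for illustration only)."""
    name: str
    seed: int
    step: int
    expected: bool  # expected value of the is_problematic target
    fold: str       # "train" or "test"


# Transcribed by hand from the case listing above.
CASES = [
    Case("clean-clause-step-2", 13, 2, False, "test"),
    Case("alt-seed-clean-clause", 42, 0, False, "train"),
    Case("clean-clause-step-0", 13, 0, False, "train"),
    Case("seed-7-clean-clause-step2", 7, 2, False, "train"),
    Case("overtime-carveout-flagged", 13, 3, True, "train"),
    Case("seed-7-clean-clause-step0", 7, 0, False, "train"),
    Case("clean-clause-step-1", 13, 1, False, "train"),
    Case("alt-seed-clean-clause-step1", 42, 1, False, "train"),
    Case("alt-seed-problematic-clause-step6", 42, 6, True, "test"),
    Case("grievance-precedent-flagged", 13, 5, True, "train"),
    Case("notification-window-30-day-breach", 42, 2, True, "train"),
    Case("seed-7-problematic-step1", 7, 1, True, "train"),
]


def fold_balance(cases):
    """Count cases per (fold, expected) pair."""
    return dict(Counter((c.fold, c.expected) for c in cases))


def duplicate_positions(cases):
    """Return (seed, step) pairs that appear in more than one case."""
    counts = Counter((c.seed, c.step) for c in cases)
    return [pos for pos, n in counts.items() if n > 1]


if __name__ == "__main__":
    for (fold, expected), n in sorted(fold_balance(CASES).items()):
        print(f"{fold:5s} expected={expected!s:5s} count={n}")
    print("duplicate (seed, step) pairs:", duplicate_positions(CASES))
```

A tally like this makes it easy to see that the test fold holds one clean and one problematic case, and that no (seed, step) position is reused across folds.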