
Recalibrate probability-of-default models monthly using new portfolio performance data.


Gated · pd-model-recalibration

Next: generate eval cases

The improvement loop needs a test suite to score against. Generate eval cases from the workflow’s spec (~30–60s, 2 LLM calls), then run an iteration.

Agent anatomy

Single-agent loop, gated by the regression suite. Below: the skills the agent has loaded, the tools it can call, and who signs off on changes.

Skills active · 0
No skills are bound to this workflow yet; they are generated on the first run.
Tools available · 6
  • load_portfolio_performance
    Pull the latest monthly portfolio performance snapshot with realized defaults, DPD, segment, and vintage tags.
    load_portfolio_performance(as_of_date: date, segments: string?) → loans: string, row_count: int
  • compute_drift
    Compare predicted PD against realized default rates across segment, vintage, and macro factor slices; flag slices exceeding tolerance.
    compute_drift(as_of_date: date, tolerance: float) → findings: string, max_drift: float
  • detect_concentration_shift
    Identify sector or vintage concentration shifts in the portfolio relative to prior periods.
    detect_concentration_shift(as_of_date: date, lookback_months: int) → shifts: string, flagged: bool
  • propose_pd_weight_adjustment
    Propose adjusted PD weights for slices whose drift exceeds tolerance.
    propose_pd_weight_adjustment(finding_id: string, target_drift: float) → proposal_id: string, proposed_weight: float, current_weight: float
  • draft_rationale
    Generate a written rationale explaining the drift evidence and the proposed weight change for CRO review.
    draft_rationale(proposal_id: string) → rationale_text: string
  • submit_for_cro_review
    Submit the recalibration proposal and rationale to the CRO approval queue.
    submit_for_cro_review(proposal_id: string) → submission_id: string, status: category
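The drift check described for compute_drift can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the tool's implementation: the `SliceStats` record, the `compute_drift_sketch` name, the in-memory input, and the use of absolute difference as the drift measure are all hypothetical. The real tool takes `as_of_date` and `tolerance` and returns `findings` as a string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceStats:
    """One portfolio slice (hypothetical shape, not the tool's real schema)."""
    name: str            # e.g. "segment=SME" or "vintage=2023Q1"
    predicted_pd: float  # mean model PD for the slice
    defaults: int        # realized defaults observed in the slice
    loans: int           # number of loans in the slice


def compute_drift_sketch(slices, tolerance):
    """Return (findings, max_drift).

    Drift is assumed here to be |predicted PD - realized default rate|.
    A finding is recorded for every slice whose drift exceeds tolerance.
    """
    findings = []
    max_drift = 0.0
    for s in slices:
        if s.loans == 0:
            continue  # no observations, so there is no realized rate to compare
        realized = s.defaults / s.loans
        drift = abs(s.predicted_pd - realized)
        max_drift = max(max_drift, drift)
        if drift > tolerance:
            findings.append({
                "slice": s.name,
                "predicted_pd": s.predicted_pd,
                "realized_rate": realized,
                "drift": drift,
            })
    return findings, max_drift


# Made-up numbers, for illustration only.
snapshot = [
    SliceStats("segment=SME", predicted_pd=0.020, defaults=45, loans=1000),
    SliceStats("segment=retail", predicted_pd=0.030, defaults=31, loans=1000),
]
findings, max_drift = compute_drift_sketch(snapshot, tolerance=0.01)
```

With these illustrative inputs only the SME slice is flagged, since its realized rate (4.5%) differs from the predicted PD (2.0%) by more than the 1% tolerance. A flagged finding like this is what propose_pd_weight_adjustment would then take as input.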
Topology & review
  • Single-agent loop
    One agent reads its skills, calls tools, and proposes the next skill version. The regression gate runs on every iteration. Phase-2 multi-agent operation is out of scope.
  • Reviewer · Chief Risk Officer (CRO)
    cadence: monthly, on each proposed recalibration
    Reviews drift findings, proposed PD weight adjustments, and the written rationale before sign-off.
  • Success · maximize pd_calibration_quality
    Proposed PD weight adjustments, when applied, reduce drift between predicted PD and realized defaults across segment, vintage, and macro factor slices, with rationales that the CRO accepts on first review.
  • Environment
    4 entity types · 3 data sources · 3 generators · 2 personas · seasonality: monthly recalibration cadence, quarterly vintage review
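One way to picture the regression gate on the single-agent loop is as a per-case score comparison between the current skill version and the proposed one. The page does not state the gate's actual criterion, so everything below is an assumption: the function name, the score dictionaries keyed by eval case id, and the rule that no existing case may drop by more than `max_drop`.

```python
def passes_regression_gate(baseline, candidate, max_drop=0.0):
    """Check a candidate skill version against baseline eval scores.

    baseline, candidate: dicts mapping eval case id -> score in [0, 1].
    Assumed rule: the candidate passes only if no case present in the
    baseline drops by more than max_drop. A case missing from the
    candidate is scored as 0.0. New cases in the candidate are ignored.
    Returns (passed, regressions), where regressions maps each failing
    case id to its (old_score, new_score) pair.
    """
    regressions = {}
    for case_id, old_score in baseline.items():
        new_score = candidate.get(case_id, 0.0)
        if old_score - new_score > max_drop:
            regressions[case_id] = (old_score, new_score)
    return len(regressions) == 0, regressions


# Made-up scores, for illustration only.
baseline = {"case-1": 1.0, "case-2": 0.5}
improved = {"case-1": 1.0, "case-2": 0.75, "case-3": 1.0}
regressed = {"case-1": 0.5, "case-2": 0.75}

ok_improved, _ = passes_regression_gate(baseline, improved)
ok_regressed, failing = passes_regression_gate(baseline, regressed)
```

Under this assumed rule the `improved` candidate passes, while `regressed` is blocked because case-1 fell from 1.0 to 0.5 even though case-2 got better. A stricter or looser gate (for example, comparing only the aggregate score) would be an equally plausible reading of the page.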

Skills and tools are read live from the kernel. Open the trace inspector to watch one run end-to-end.