Recalibrate credit lines monthly across our 22,000-SMB portfolio.
Gated · credit-risk
Iterations: 1 · first run
Latest val_score: 33.3%
Lift vs baseline: +0.0pp
Pending proposals: 0 · 12 cases in suite
Lift curve · 1 iteration · val_score over time (series: val_score, Baseline)
| Case | Predicted | Expected | Result | Agent rationale |
|---|---|---|---|---|
| default-medium-trajectory | ✗ | ✓ | fail | APP_00009 has a moderate credit score (659), very low DTI ratio (0.137), and loan amount well below income—all factors associated with lowe… |
| stale-dpd-band-non-default | ✓ | ✗ | fail | Despite favorable financial metrics (high credit score of 718, low DTI of 0.156, high income), the strong base rate of defaults (11/11 in t… |
| alt-seed-default-applicant-step4 | ✗ | ✓ | fail | APP_00005 has the highest income (104,777), a moderate credit score (694), a reasonable DTI ratio (0.627), and a loan amount proportional t… |
| seed-99-non-default-applicant | ✓ | ✗ | fail | APP_00002 has a significantly elevated debt-to-income ratio (0.636 vs. 0.226) despite a moderately higher credit score, indicating heighten… |
| default-mid-trajectory | ✗ | ✓ | fail | APP_00003 has a strong credit score (742), low DTI ratio (0.131), and reasonable loan-to-income ratio, all indicators of low default risk. |
| rate-shock-line-too-high | ✗ | ✓ | fail | The applicant shows low credit risk: credit score of 671 is acceptable, DTI ratio of 0.379 is reasonable (below 0.43 threshold), and loan a… |
| seed-99-default-applicant | ✗ | ✓ | fail | Credit score of 646 is marginal but acceptable, DTI ratio of 0.226 is healthy (well below 0.43 threshold), income-to-loan ratio is strong a… |
| hospitality-concentration-default | ✗ | ✓ | fail | Credit score of 664 is fair-to-good, DTI of 0.635 is manageable, and loan-to-income ratio (63%) is reasonable, suggesting low default risk. |
| non-default-low-dti | ✗ | ✗ | pass | APP_00012 has a credit score of 682, moderate DTI of 0.424, and healthy income-to-loan ratio, positioning it favorably relative to the defa… |
| non-default-clean-applicant | ✗ | ✗ | pass | APP_00002 has a low DTI ratio (0.14), reasonable credit score (624), and small loan-to-income ratio, suggesting lower default risk than APP… |
| seed-99-non-default-step4 | ✗ | ✗ | pass | APP_00005 has a low DTI ratio (0.17), high income (96478), and conservative loan amount (16420), all protective factors; though credit scor… |
| alt-seed-low-risk-applicant | ✗ | ✗ | pass | APP_00003 has a lower credit score (666) than the first two defaulters, but shows much stronger fundamentals: DTI of 0.286 (vs 0.379 and 2.… |
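The headline numbers can be reproduced from the case table. The sketch below is a minimal reconstruction, not the workflow's own scoring code: it assumes ✓ means "default" and ✗ means "non-default" (consistent with the case names), that val_score is the plain fraction of passing cases across all 12 cases, and that on a first run the baseline equals the run's own score. The names `CASES`, `val_score`, `lift_pp`, and `confusion` are hypothetical.

```python
from collections import Counter

# (case name, predicted default, expected default), copied from the case table.
# Assumption: a check mark means "default", a cross means "non-default".
CASES = [
    ("default-medium-trajectory", False, True),
    ("stale-dpd-band-non-default", True, False),
    ("alt-seed-default-applicant-step4", False, True),
    ("seed-99-non-default-applicant", True, False),
    ("default-mid-trajectory", False, True),
    ("rate-shock-line-too-high", False, True),
    ("seed-99-default-applicant", False, True),
    ("hospitality-concentration-default", False, True),
    ("non-default-low-dti", False, False),
    ("non-default-clean-applicant", False, False),
    ("seed-99-non-default-step4", False, False),
    ("alt-seed-low-risk-applicant", False, False),
]


def val_score(cases):
    """Fraction of cases where the prediction matches the expectation."""
    passed = sum(1 for _, predicted, expected in cases if predicted == expected)
    return passed / len(cases)


def lift_pp(score, baseline):
    """Difference between score and baseline, in percentage points."""
    return (score - baseline) * 100.0


def confusion(cases):
    """Count cases by (predicted, expected) pair."""
    return Counter((predicted, expected) for _, predicted, expected in cases)


score = val_score(CASES)
print(f"val_score = {score:.1%}")
# Assumption: with a single iteration, the baseline is the run's own score.
print(f"lift = {lift_pp(score, baseline=score):+.1f}pp")
print(confusion(CASES))
```

On the table's data the tally shows that all six expected defaults were predicted as non-default, and both predicted defaults were expected non-defaults, so every pass comes from a non-default case.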
Failed · test fold · 2
default-medium-trajectory
APP_00009 has a moderate credit score (659), very low DTI ratio (0.137), and loan amount well below income—al…
seed-99-non-default-applicant
APP_00002 has a significantly elevated debt-to-income ratio (0.636 vs. 0.226) despite a moderately higher cre…
Failed · train · 6
stale-dpd-band-non-default
Despite favorable financial metrics (high credit score of 718, low DTI of 0.156, high income), the strong bas…
alt-seed-default-applicant-step4
APP_00005 has the highest income (104,777), a moderate credit score (694), a reasonable DTI ratio (0.627), an…
default-mid-trajectory
APP_00003 has a strong credit score (742), low DTI ratio (0.131), and reasonable loan-to-income ratio, all in…
rate-shock-line-too-high
The applicant shows low credit risk: credit score of 671 is acceptable, DTI ratio of 0.379 is reasonable (bel…
seed-99-default-applicant
Credit score of 646 is marginal but acceptable, DTI ratio of 0.226 is healthy (well below 0.43 threshold), in…
hospitality-concentration-default
Credit score of 664 is fair-to-good, DTI of 0.635 is manageable, and loan-to-income ratio (63%) is reasonable…
Passed · 4
non-default-low-dti
APP_00012 has a credit score of 682, moderate DTI of 0.424, and healthy income-to-loan ratio, positioning it…
non-default-clean-applicant
APP_00002 has a low DTI ratio (0.14), reasonable credit score (624), and small loan-to-income ratio, suggesti…
seed-99-non-default-step4
APP_00005 has a low DTI ratio (0.17), high income (96478), and conservative loan amount (16420), all protecti…
alt-seed-low-risk-applicant
APP_00003 has a lower credit score (666) than the first two defaulters, but shows much stronger fundamentals:…
Iterations · 1
| Iter | val_score | Best ever | State | Approved? | Ended |
|---|---|---|---|---|---|
| #0 | 0.333 | 0.333 | gate-blocked-no-improvement | | 2026-05-19 04:29 |
Agent anatomy
Single-agent loop, gated by the regression suite. Below: the skills the agent has loaded, the tools it can call, and who signs off on changes.
Skills active · 0
No skills bound to this workflow yet — generated on first run.
Tools available · 4
- propose_line_change: Recommends a new credit limit and action.
  propose_line_change(account_id: string, proposed_limit: float, action: category, rationale: string)
- query_repayment_history: Returns weekly repayment + DPD history for an account.
  query_repayment_history(account_id: string, months_back: int) → repayment_series: string
- fetch_sector_exposure: Aggregate exposure for the account's sector.
  fetch_sector_exposure(sector: category) → exposure_pct: float
- fetch_dnb_signal: External credit signal from Dun & Bradstreet.
  fetch_dnb_signal(account_id: string) → signal_score: float
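For readers who want to call or mock these tools from code, the four signatures can be written as a typed interface. This is a sketch under assumptions, not the kernel's actual binding: `category` is mapped to `str` because the allowed values are not listed, `propose_line_change` is assumed to return nothing because no return type is shown, and `LineChangeProposal`, `CreditLineTools`, and `RecordingTools` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass(frozen=True)
class LineChangeProposal:
    account_id: str
    proposed_limit: float
    action: str  # "category" in the tool list; allowed values are not shown, so str is assumed
    rationale: str


class CreditLineTools(Protocol):
    """Typed mirror of the four tool signatures listed above."""

    def propose_line_change(
        self, account_id: str, proposed_limit: float, action: str, rationale: str
    ) -> None: ...

    def query_repayment_history(self, account_id: str, months_back: int) -> str: ...

    def fetch_sector_exposure(self, sector: str) -> float: ...

    def fetch_dnb_signal(self, account_id: str) -> float: ...


@dataclass
class RecordingTools:
    """Hypothetical test double: records proposals, provides no real data."""

    proposals: List[LineChangeProposal] = field(default_factory=list)

    def propose_line_change(
        self, account_id: str, proposed_limit: float, action: str, rationale: str
    ) -> None:
        self.proposals.append(
            LineChangeProposal(account_id, proposed_limit, action, rationale)
        )

    def query_repayment_history(self, account_id: str, months_back: int) -> str:
        raise NotImplementedError("repayment data comes from the live environment")

    def fetch_sector_exposure(self, sector: str) -> float:
        raise NotImplementedError("exposure data comes from the live environment")

    def fetch_dnb_signal(self, account_id: str) -> float:
        raise NotImplementedError("D&B signals come from the live environment")


# RecordingTools satisfies the CreditLineTools protocol structurally.
tools: CreditLineTools = RecordingTools()
```

A double like this lets a regression case assert on what the agent proposed without touching real account data.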
Topology & review
- Single-agent loop: One agent reads its skills, calls tools, and proposes the next skill version. Regression gate runs every iteration. Phase-2 multi-agent is out of scope.
- Reviewer · Chief Risk Officer · cadence: weekly: Approves or rejects proposed line changes.
- Success · maximize line_recalibration_composite: A recommendation is correct if the account does not breach the new limit within 90 days and does not default within 180 days. Composite of breach-rate, default-rate, and over-tightening false-positive rate.
- Environment: 2 entity types · 2 data sources · 2 generators · 2 personas · seasonality: quarterly, rate-cycle
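The per-recommendation correctness rule in the Success item can be expressed as a small predicate. This is a minimal sketch, assuming both windows are measured in days from the recommendation date and that "within" includes the boundary day; the `Outcome` type and its field names are hypothetical. The composite's weighting of breach-rate, default-rate, and over-tightening false-positive rate is not stated, so it is left out.

```python
from dataclasses import dataclass
from typing import Optional

BREACH_WINDOW_DAYS = 90    # from the Success item
DEFAULT_WINDOW_DAYS = 180  # from the Success item


@dataclass(frozen=True)
class Outcome:
    # Days from the recommendation until the event; None if it was never observed.
    days_to_breach: Optional[int]
    days_to_default: Optional[int]


def recommendation_is_correct(outcome: Outcome) -> bool:
    """True if there is no breach within 90 days and no default within 180 days."""
    breached = (
        outcome.days_to_breach is not None
        and outcome.days_to_breach <= BREACH_WINDOW_DAYS  # assumption: inclusive boundary
    )
    defaulted = (
        outcome.days_to_default is not None
        and outcome.days_to_default <= DEFAULT_WINDOW_DAYS  # assumption: inclusive boundary
    )
    return not breached and not defaulted


# A breach on day 120 falls outside the 90-day window, so it does not count against
# the recommendation; a default on day 150 falls inside the 180-day window and does.
print(recommendation_is_correct(Outcome(days_to_breach=120, days_to_default=None)))
print(recommendation_is_correct(Outcome(days_to_breach=None, days_to_default=150)))
```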
Skills + tools are read live from the kernel. Open the trace inspector to watch one run end-to-end.