Read-only demo. Approve, reject, deploy, and iteration actions are disabled. Self-host from GitHub.

Forecast weekly demand at SKU-store level for the next four weeks.

Gated · sku-demand-forecast-markdown-flagging

Next: generate eval cases

The improvement loop needs a test suite to score against. Generate eval cases from the workflow’s spec (~30–60s, 2 LLM calls), then run an iteration.

Agent anatomy

Single-agent loop, gated by the regression suite. Below: the skills the agent has loaded, the tools it can call, and who signs off on changes.

Skills active · 0
No skills are bound to this workflow yet; they are generated on the first run.
Tools available · 6
  • fetch_sales_history
    Pull historical weekly POS sales for a SKU-store pair.
    fetch_sales_history(sku_id: string, store_id: string, weeks_back: int) → sales_series: string
  • fetch_promotions
    Get planned promotions affecting a SKU over the forecast horizon.
    fetch_promotions(sku_id: string, horizon_weeks: int) → promotions: string
  • forecast_demand
    Produce a 4-week weekly demand forecast at SKU-store level, accounting for seasonality, promotions, and regional variance.
    forecast_demand(sku_id: string, store_id: string, horizon_weeks: int) → forecast: string
  • compute_markdown_risk
    Score the likelihood that a SKU-store pair will need a markdown within four weeks, given inventory, forecast, and sell-through.
    compute_markdown_risk(sku_id: string, store_id: string) → markdown_probability: float, predicted_markdown_week: date, severity: category
  • get_regional_factor
    Return regional demand adjustment factor for a store's region and category.
    get_regional_factor(store_id: string, category: string) → regional_factor: float
  • raise_markdown_flag
    Submit a markdown flag for category planner review.
    raise_markdown_flag(sku_id: string, store_id: string, predicted_markdown_week: date, severity: category, rationale: string) → flag_id: string
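
As an illustration of how two of the tools listed above might be chained, here is a minimal Python sketch that scores one SKU-store pair with compute_markdown_risk and submits a flag with raise_markdown_flag. The argument and return shapes follow the listing. Everything else is an assumption made for the example: the `MarkdownRisk` container, the `flag_if_risky` helper, the 0.5 threshold, the rationale text, and the stand-in tool functions. None of it is the workflow's actual implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass(frozen=True)
class MarkdownRisk:
    # Mirrors the three outputs listed for compute_markdown_risk.
    markdown_probability: float
    predicted_markdown_week: date
    severity: str  # listed as "category"; the allowed values are not given


# Callable shapes copied from the tool listing.
ComputeRisk = Callable[[str, str], MarkdownRisk]
RaiseFlag = Callable[[str, str, date, str, str], str]


def flag_if_risky(
    sku_id: str,
    store_id: str,
    compute_markdown_risk: ComputeRisk,
    raise_markdown_flag: RaiseFlag,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the workflow
) -> Optional[str]:
    """Score one SKU-store pair; raise a flag when risk meets the threshold."""
    risk = compute_markdown_risk(sku_id, store_id)
    if risk.markdown_probability < threshold:
        return None  # below the cut-off: nothing is sent to the planner
    rationale = f"markdown_probability={risk.markdown_probability:.2f}"
    return raise_markdown_flag(
        sku_id, store_id, risk.predicted_markdown_week, risk.severity, rationale
    )


# Stand-in tools so the sketch runs without the real kernel (hypothetical values).
def fake_compute_markdown_risk(sku_id: str, store_id: str) -> MarkdownRisk:
    return MarkdownRisk(0.8, date(2024, 12, 2), "high")


def fake_raise_markdown_flag(
    sku_id: str, store_id: str, week: date, severity: str, rationale: str
) -> str:
    return f"flag-{sku_id}-{store_id}"
```

Passing the tools in as callables keeps the decision logic testable without a live kernel; the real agent would bind the actual tool handles instead of the stand-ins.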
Topology & review
  • Single-agent loop
    One agent reads its skills, calls tools, and proposes the next skill version. The regression gate runs on every iteration. Multi-agent orchestration (Phase 2) is out of scope.
  • Reviewer · Category planner
    cadence: weekly
    Reviews markdown flags weekly and decides which to act on.
  • Success · maximize markdown_flag_precision_recall
    Correctly forecast weekly SKU-store demand for the next four weeks and flag SKUs that genuinely need a markdown within four weeks, avoiding both known misses: the Q4 Northeast cold-weather promo under-forecast and the slow-moving holiday over-forecast.
  • Environment
    5 entity types · 4 data sources · 4 generators · 1 persona · seasonality: Q4 holiday peak, cold-weather ramp, promotional events
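
The success metric is named markdown_flag_precision_recall, but the page does not say how precision and recall are combined into a single score. The sketch below therefore computes only the two standard quantities over sets of (sku_id, store_id) pairs. The function name `flag_precision_recall` and the set-based inputs are assumptions made for illustration.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (sku_id, store_id)


def flag_precision_recall(
    flagged: Set[Pair], needed_markdown: Set[Pair]
) -> Tuple[float, float]:
    """Return (precision, recall) for raised flags against actual markdowns."""
    true_positives = len(flagged & needed_markdown)
    # Precision: share of raised flags that turned out to need a markdown.
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: share of pairs needing a markdown that were actually flagged.
    recall = true_positives / len(needed_markdown) if needed_markdown else 0.0
    return precision, recall
```

In these terms, an over-forecast that hides a slow-moving SKU shows up as a missed flag and lowers recall, while a flag raised on a SKU that sells through lowers precision.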

Skills and tools are read live from the kernel. Open the trace inspector to watch one run end to end.