Environment Status

Healthy
Check | Status | Detail
dotenv_loaded | OK | Loaded
voyage_api_key | Missing |

Cost Summary

No completed runs found.

Lever: content_scope

config/levers/content_scope.json
Key | Value | Used where
target_curriculum | Kurikulum Merdeka | Fallback curriculum label written into manifests when the preflight confirmation form doesn't override it. User-confirmed CONFIRMED_INPUTS.json values always win at runtime.
grade_band | 7-9 | Fallback grade-band string. Feeds KB stage filtering via kb.py grade normalization.
geography | ID | Fallback ISO country code. Feeds scorer curriculum_context and anti-bias guidance.
language | en | Fallback UI / output language code. Non-English source texts are translated during textbook extraction to match.
content_type_sequence | Learn, Practice, Challenge |
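The fallback semantics above can be sketched as a plain dict merge. This is an illustrative sketch, not the runtime code: `effective_scope` and `CONTENT_SCOPE_LEVER` are hypothetical names, and it assumes a confirmed value of None means "not provided".

```python
# Sketch of the content_scope fallback rule: lever values are defaults
# only; any key the reviewer confirmed in CONFIRMED_INPUTS.json wins.
# Hypothetical helper -- the real runtime code may differ.

CONTENT_SCOPE_LEVER = {
    "target_curriculum": "Kurikulum Merdeka",
    "grade_band": "7-9",
    "geography": "ID",
    "language": "en",
}

def effective_scope(confirmed_inputs: dict) -> dict:
    """Lever fallbacks, overridden by non-None user-confirmed values."""
    merged = dict(CONTENT_SCOPE_LEVER)
    merged.update({k: v for k, v in confirmed_inputs.items() if v is not None})
    return merged
```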

Lever: infrastructure

config/levers/infrastructure.json
Key | Value | Used where
model | claude-sonnet-4-6 | Primary Sonnet model. Referenced as `model_key: "model"` by every skill doing heavy structured work — LO derivation, research, blueprint build, scorers, module sequence, mastery. Changing this ID promotes/demotes every such skill at once.
model_light | claude-haiku-4-5-20251001 | Lightweight model. Referenced as `model_key: "model_light"` by skills that prefer speed/cost over depth — generate_action_summaries, extract_applet_summary, refine_gate_entry, refine_module_entry. Keep on Haiku unless you want to promote those calls to Sonnet.
temperature | 0.2 | Sampling temperature for every api_call(). Skills can override it; the default keeps outputs stable for reviewer comparison.
rate_limit_delay_seconds | 35 | Pause between consecutive api_call() invocations. Exposed as rate_delay() and called by every agent between skill steps to stay under Anthropic's per-minute rate limits.
vision_enabled | True | Global on/off for vision blocks on scorer calls (score_rubric_domain D/X, score_gates, scan_design_pitfalls). When false, scorers fall back to text-only and observed_visuals stays empty.
thumbnail_max_width_px | 400 | Max width (px) for rendered slide thumbnails passed to vision-enabled scorers. Read by tools/manifest.py::extract_thumbnails() at T5 via cfg.lever('infrastructure')['thumbnail_max_width_px']. Larger = better OCR fidelity for the vision scorers but more input tokens per slide image. 400 is the tested baseline; raise to 600-800 if the scorer starts hallucinating fine-grained visual content.
tool_use_required | True | When true (the default), api_call() forces the model to emit the skill's tool via tool_choice={'type':'tool', 'name':tool_name}. When false, api_call() uses tool_choice={'type':'auto'} — the model MAY return free-form text, which existing skills will then raise on (they all parse tool_use blocks from the response). Flip only for experimental skills that explicitly handle non-tool responses.
skill_budgets
Default (unlisted skills): cap 16,384, model claude-sonnet-4-6 — medium Sonnet output for any skill not listed below.
Skills (17):
Skill (tool_name) | Cap | Model | Description
generate_action_summaries_result | 16,384 | claude-haiku-4-5-20251001 | A2/A3 S9 — generate per-slide action summaries from compiled review + mastery for the T8 review HTML narrative sections. Haiku; 16k cap because action summaries are a per-slide × per-gate × per-domain aggregate (S19: raised from 4k after hitting the cap on G9C4M1, a 13-slide module).
extract_applet_summary | 8,192 | claude-haiku-4-5-20251001 | A2 S4 — pre-process each applet storyboard into a compact module-linked summary before S5+ scoring. Haiku; 8k cap (S19: raised from 4k preemptively after G9C4M1 applets used 42% of the old cap).
refine_gate_entry | 8,192 | claude-haiku-4-5-20251001 | LMS /api/gates/refine-entry — reviewer edits a single gate sub-entry (misconception, false-friend, slide row) with a preset + instruction. Haiku; 8k headroom for Phase 2 fields across refined entries.
refine_module_result | 8,192 | claude-haiku-4-5-20251001 | LMS /refine-module (chapter module sequence gate) — reviewer splits, merges, or rewrites a module entry while preserving Phase 2 fields. Haiku; 8k.
insert_module_result | 8,192 | claude-haiku-4-5-20251001 | LMS /refine-module operation=insert — reviewer inserts new module(s) between existing modules. Synthesises rich fields from KB evidence. Haiku; 8k.
assess_learning_coverage_result | 16,384 | claude-sonnet-4-6 | A2/A3 S8 — learning coverage rubric against the compiled review: fluency, gaps, depth, transfer, residual misconceptions. Sonnet; 16k cap (S19: raised from 8k preemptively after G9C4M1 used 62% of the old cap on a 13-slide module).
derive_module_lo | 8,192 | claude-sonnet-4-6 | A2 S1 — derive the module-level learning objective from CPA phase, scope, and parent chapter LO brief. Sonnet; small structured output.
derive_applet_lo | 8,192 | claude-sonnet-4-6 | A3 S1 — derive the applet-level learning objective from the parent module LO and applet placement context. Sonnet; small structured output.
derive_lo_brief | 16,384 | claude-sonnet-4-6 | A1 S1a.1 — chapter learning-objective brief from CONFIRMED_INPUTS + textbook / scope & sequence context. Sonnet; medium cap.
research_deep_synthesis | 32,000 | claude-sonnet-4-6 | A2/A3 S2 — KB synthesis (vocabulary, misconceptions, false friends, textbook examples) grounding the blueprint build. Sonnet; raised 16,384 → 32,000 as a safety margin on top of input slimming (S32 phase 1: VIR svg stripped, scope-aware VIR cap, chapter_research whitelisted, enrichment_history capped to top 3 events per component).
build_applet_screens | 48,000 | claude-sonnet-4-6 | A3 S3 — applet 'ideal screens' blueprint with S23 structured fields (student_sees/does/thinks, intermediate_states, feedback, affordances, variants_and_invariants, time_estimate, difficulty, transitions). Sonnet; heavy cap (6-12 screens × rich structured fields).
textbook_extract | 16,384 | claude-sonnet-4-6 | A0 textbook extractor — structures raw PDF page text into the textbook JSON schema consumed by A1. Sonnet; medium cap.
build_module_blueprint | 48,000 | claude-sonnet-4-6 | A2 S3 — module 'ideal slides' blueprint with per-slide Phase 2 structural fields, textbook_alignment, vocabulary_introduction_plan. Sonnet; heavy cap (many per-slide fields).
score_rubric_domain_result | 48,000 | claude-sonnet-4-6 | A2/A3 S6 — per-slide P / D / X rubric domain scoring. Vision scorer emitting observed_visuals + concepts_on_slide; carries the visual-clutter directive. Sonnet; heavy cap.
score_gates_result | 48,000 | claude-sonnet-4-6 | A2/A3 S5 — gate-level rubric scoring against the blueprint gates. Vision when thumbnails are attached; carries the visual-clutter directive. Sonnet; heavy cap.
scan_design_pitfalls_result | 48,000 | claude-sonnet-4-6 | A2/A3 S7 — design pitfall scan across slides. Vision when thumbnails are attached; carries the visual-clutter + weak-hierarchy observation block. Sonnet; heavy cap.
derive_module_sequence | 64,000 | claude-sonnet-4-6 | A1 S1a.2 — chapter-wide 25+ module sequence with Phase 2 annotations, textbook_validation, reconciliation_map, video_assignments. Heaviest call in the pipeline; silently hit 24k in Session 18. Sonnet; colossal cap.
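The cap/model resolution described above (listed skill wins, otherwise the default budget) can be sketched as a dict lookup with a fallback. `budget_for`, `SKILL_BUDGETS`, and `DEFAULT_BUDGET` are hypothetical names; the real code reads this lever through the config layer.

```python
# Sketch of skill_budgets resolution: unlisted skills fall back to the
# default (cap 16,384, Sonnet). Hypothetical shape -- illustration only.

DEFAULT_BUDGET = {"cap": 16_384, "model": "claude-sonnet-4-6"}

SKILL_BUDGETS = {
    "generate_action_summaries_result": {"cap": 16_384, "model": "claude-haiku-4-5-20251001"},
    "derive_module_sequence": {"cap": 64_000, "model": "claude-sonnet-4-6"},
    # ... remaining listed skills elided
}

def budget_for(tool_name: str) -> dict:
    """Listed skill budget, else the default medium Sonnet budget."""
    return SKILL_BUDGETS.get(tool_name, DEFAULT_BUDGET)
```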
pricing_per_million_tokens
{
  "claude-haiku-4-5-20251001": {
    "input": 0.8,
    "output": 4.0
  },
  "claude-opus-4-6": {
    "input": 15.0,
    "output": 75.0
  },
  "claude-sonnet-4-6": {
    "input": 3.0,
    "output": 15.0
  }
}
Per-model input/output price table used by api_call() to compute cost_usd for every step. Feeds the manifest per-step cost and the admin page Cost Summary.
retry_max_attempts | 4 | Max retries on transient API errors (overloaded_error, rate_limit_error, api_error) inside api_call()'s streaming loop.
retry_base_delay_seconds | 30 | Initial backoff before the first retry; doubles on each attempt (exponential: base, 2×base, 4×base, 8×base).
agent_timeout_seconds | 7200 | Hard wall-clock cap per agent subprocess (a1_run, a2_run, a3_run, *_resume, *_rework). Enforced by _runner.py. Two hours by default.
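The retry backoff and per-step cost arithmetic documented above can be sketched as follows. The function names are hypothetical; the real logic lives inside api_call(), and the pricing dict mirrors the lever's pricing_per_million_tokens values.

```python
# Sketch of api_call()'s retry backoff and cost_usd arithmetic, using the
# lever values documented above (illustrative only, not the real code).

PRICING = {  # USD per million tokens, from pricing_per_million_tokens
    "claude-haiku-4-5-20251001": {"input": 0.8, "output": 4.0},
    "claude-sonnet-4-6": {"input": 3.0, "output": 15.0},
}

def backoff_delays(max_attempts: int = 4, base: float = 30.0) -> list:
    """One delay per retry attempt: base, 2*base, 4*base, 8*base."""
    return [base * (2 ** i) for i in range(max_attempts)]

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Per-step cost as recorded in the manifest and Cost Summary."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```

With the defaults (retry_max_attempts=4, retry_base_delay_seconds=30), the worst case waits 30 + 60 + 120 + 240 seconds before giving up.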

Lever: kb

config/levers/kb.json
Key | Value | Used where
embedding_enabled | True | Enables Voyage embeddings for semantic KB search. When false, kb_query falls back to substring filtering only (no vector search).
embedding_model | voyage-4-lite | Voyage model used to embed both the KB text columns and kb_query's search text. Must match whatever the embedding_tables were indexed with — changing it requires re-embedding all rows.
embedding_dimensions | 1024 | Dimensionality of the embedding vectors. Must match the KB schema's vector columns; 1024 for voyage-4-lite.
embedding_tables | misconceptions, interactive_landscape, visual_insight_records, classroots_topics, ias_patterns, cluster_mappings | KB tables that support semantic search. kb_query's semantic leg is limited to these; other tables are filter-only.
embedding_similarity_threshold | 0.5 | Cosine-similarity floor for semantic matches. Results below this are dropped before the LLM ever sees them.
default_limit | 10 | Default per-table result cap on kb_query. Small = tight context / low tokens but may miss evidence; large = richer context but more input tokens on every research skill. Applies to every A1/A2/A3 research call unless the caller passes a limit override.
healer_enabled | True |
healing_similarity_threshold | 0.6 |
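The interaction of embedding_similarity_threshold and default_limit can be sketched as a filter-then-cap over scored rows. `semantic_filter` is a hypothetical name; the real kb_query also runs the substring leg and per-table logic.

```python
# Sketch of kb_query's semantic leg: drop rows below the cosine-similarity
# floor, keep the best matches, cap at default_limit per table.
# Hypothetical data shapes -- illustration only.

SIMILARITY_THRESHOLD = 0.5  # embedding_similarity_threshold
DEFAULT_LIMIT = 10          # default_limit

def semantic_filter(scored_rows, threshold=SIMILARITY_THRESHOLD, limit=DEFAULT_LIMIT):
    """scored_rows: list of (cosine_similarity, row_dict) tuples."""
    kept = [(s, r) for s, r in scored_rows if s >= threshold]
    kept.sort(key=lambda sr: sr[0], reverse=True)  # best match first
    return [r for _, r in kept[:limit]]
```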

Lever: pedagogical

config/levers/pedagogical.json
Key | Value | Used where
verdict_thresholds
{
  "reject": 0,
  "rework": 50,
  "ship": 90,
  "upgrade": 75
}
Per-domain score percentages that map to Reject / Rework / Upgrade / Ship verdicts in score_rubric_domain. Tighten for stricter review.
cpa_strictness
{
  "challenge_module": "abbreviated",
  "learn_module": "full",
  "practice_module": "abbreviated"
}
min_visual_models_per_module | 2 |
min_misconceptions_addressed | 1 |
trope_requirements
{
  "exit_question": 1,
  "math_vault": 1,
  "snapshot": 3
}
applet_time_budget_minutes | 7 |
module_time_budget_minutes | 30 |
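The verdict_thresholds mapping above reads naturally as "a score earns the highest verdict whose floor it meets". A minimal sketch, assuming that interpretation — `verdict_for` is a hypothetical name, and score_rubric_domain's actual logic may differ:

```python
# Sketch of verdict resolution from verdict_thresholds: pick the verdict
# with the highest floor that the score still meets. Illustration only.

VERDICT_THRESHOLDS = {"reject": 0, "rework": 50, "ship": 90, "upgrade": 75}

def verdict_for(score_pct: float) -> str:
    best = "reject"
    # Walk floors in ascending order so the last match is the highest met.
    for name, floor in sorted(VERDICT_THRESHOLDS.items(), key=lambda kv: kv[1]):
        if score_pct >= floor:
            best = name
    return best
```

Under this reading, 90+ ships, 75-89 upgrades, 50-74 reworks, and anything below 50 is rejected; raising the "ship" floor is the "tighten for stricter review" knob.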