Human overview · for understanding
See exactly which skill spent the money — and what one minute of TTS really costs · 2026-06-22
Master summary — the gist in 30 seconds
Input: every AI call's real token counts + the price table you already have. Output: a tab that says 'Reply drafting spent $X, Enrichment $Y, TTS $Z (= $0.04 per audio minute)', so you instantly see what's expensive.
flowchart LR
    A["Every AI call<br/>(app + laptop skills)"] --> B["usage.record<br/>tokens + cost + unit"]
    B --> C["keyed rows in<br/>the store"]
    C --> D["AI Usage tab<br/>per skill / per category"]
    D --> E["$/min TTS<br/>$/lead · $/call"]
Input: any Claude/Gemini call in the app. Every call flows through one client (extract.AI), which already records tokens × price via usage.py. Output: the flat 'AI Usage' table (task, calls, tokens, cost) that the dash shows today.
flowchart TD
    X["draft reply · extract · enrich · eval"] --> AI["extract.AI client<br/>(one funnel)"]
    AI --> L["_log -> usage.record"]
    L --> S["store: usage:date:promptkey"]
    S --> T["dash AI Usage table"]
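The funnel above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `record` signature, the `PRICES` layout, the placeholder prices and the in-memory `STORE` are all hypothetical, and only the `usage:date:promptkey` key shape comes from the diagram.

```python
from collections import defaultdict
from datetime import date

# Hypothetical price table: USD per one million tokens (input, output).
# These figures are placeholders, not real vendor prices.
PRICES = {"example-model": {"in": 1.00, "out": 4.00}}

# In-memory stand-in for the store; keys follow the usage:date:promptkey shape.
STORE = defaultdict(lambda: {"calls": 0, "tokens": 0, "cost": 0.0})


def record(prompt_key, model, tokens_in, tokens_out, day=None):
    """Add one call's tokens and cost to the row for (day, prompt_key)."""
    day = day or date.today()
    price = PRICES[model]
    # Cost is tokens times price, with prices quoted per million tokens.
    cost = (tokens_in * price["in"] + tokens_out * price["out"]) / 1_000_000
    row = STORE[f"usage:{day.isoformat()}:{prompt_key}"]
    row["calls"] += 1
    row["tokens"] += tokens_in + tokens_out
    row["cost"] += cost
    return cost
```

Because every call passes through one function, the flat table is just a listing of the store's rows, one per task and day.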
Input: today's flat per-task log. Gaps: (1) no category grouping, (2) some calls log as 'unknown', (3) no 'unit' field, (4) laptop skills (/TTS, finish-session-tts) log nothing, (5) the TTS model isn't even in the price table. Output once fixed: a real per-skill/category + unit-economics view.
mindmap
  root((5 gaps))
    No category dimension
    'unknown' holes
    No unit field
    Laptop skills unlogged
    TTS price missing
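To make the first three gaps concrete, here is a minimal sketch of a usage row that carries a category and a unit, plus a per-category rollup. The `UsageRow` fields and the `by_category` helper are hypothetical names chosen for illustration, not the app's existing schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class UsageRow:
    skill: str             # e.g. "draft_reply", "tts"
    cost: float            # USD for this call
    category: str = ""     # hypothetical grouping label
    unit: str = ""         # e.g. "audio_seconds", "leads", "calls"
    unit_qty: float = 0.0  # how many units this call produced


def by_category(rows):
    """Sum cost per category; unlabeled rows land in 'unknown'."""
    totals = defaultdict(float)
    for row in rows:
        totals[row.category or "unknown"] += row.cost
    return dict(totals)
```

Routing unlabeled rows to an explicit 'unknown' bucket keeps the holes visible on the tab instead of silently dropping their spend.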
Input: /TTS finishes and already knows the audio's duration in seconds + the model. Output: it POSTs {model, tokens/seconds, unit} to a small token-guarded /dash/api/usage/ingest endpoint, which writes it into the same store as the app's calls. One shared list, two sources.
flowchart LR
    TTS["/TTS on laptop<br/>knows duration_s"] -->|POST + dash token| IN["/dash/api/usage/ingest"]
    IN --> S["same store"]
    APP["app AI calls"] --> S
    S --> TAB["AI Usage tab"]
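A laptop-side caller for this endpoint might look like the sketch below, using only the standard library. The payload field names, the `X-Dash-Token` header and the `build_ingest_request` helper are assumptions for illustration; only the `/dash/api/usage/ingest` path and the idea of a token guard come from the description above.

```python
import json
import urllib.request


def build_ingest_request(base_url, token, model, duration_s):
    """Build (but do not send) the POST that reports one TTS run."""
    payload = {
        "model": model,
        "seconds": duration_s,    # audio duration already known to /TTS
        "unit": "audio_seconds",  # hypothetical unit label
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/dash/api/usage/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Dash-Token": token,  # hypothetical header name for the guard
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect before anything touches the store.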
Input: each call logs its unit — audio seconds for TTS, lead count for enrich, a transcript for call-parse. Output: $/minute of TTS = total TTS cost / (total seconds / 60); same shape for $/lead and $/call.
flowchart TD
    C["TTS cost (sum $)"] --> R["divide"]
    U["audio seconds / 60"] --> R
    R --> P["$ per minute of audio"]
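The division above is small enough to write out directly. The function names below are hypothetical and the figures in the note that follows are illustrative only.

```python
def cost_per_unit(total_cost, total_units):
    """Return cost per unit, or None when there are no units to divide by."""
    if total_units <= 0:
        return None
    return total_cost / total_units


def tts_cost_per_minute(total_cost, total_seconds):
    """$/minute of audio = total TTS cost / (total seconds / 60)."""
    return cost_per_unit(total_cost, total_seconds / 60)
```

For example, $0.12 of TTS spend over 180 seconds of audio is 3 minutes, which works out to $0.04 per minute. The same `cost_per_unit` helper covers $/lead and $/call by passing a lead count or call count instead of minutes.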
Input: one API key + our own labels (category, unit) on every call. Output: an instrumented estimate that IS the live truth on the tab. Periodically you open the Gemini/Anthropic billing console (via Chrome DevTools) and check our total against theirs to keep it honest.
flowchart LR
    K["ONE Gemini key"] --> I["label every call<br/>category + unit"]
    I --> E["instrumented estimate<br/>(LIVE truth)"]
    E -. monthly compare .-> B["billing console<br/>(reconcile anchor)"]
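The periodic comparison can be reduced to one drift figure. The sketch below assumes the billed total is read off the console by hand; the `reconcile` name and the 5% tolerance are hypothetical defaults, not decisions made in this document.

```python
def reconcile(estimated_total, billed_total, tolerance=0.05):
    """Return (drift, ok): relative gap between our estimate and the bill."""
    if billed_total <= 0:
        raise ValueError("billed_total must be positive")
    # Negative drift means the instrumented estimate under-counts spend.
    drift = (estimated_total - billed_total) / billed_total
    return drift, abs(drift) <= tolerance
```

A persistently negative drift would suggest calls that still bypass the logging funnel, while a positive one points at a stale price table.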
Input: this handoff (intent locked, gaps named, defaults proposed). Output: Instance 2 researches + writes checklist.md + ADRs for the 7 deferred technical choices; Instance 3 executes and QA-verifies the new tab, sentinel-gated.
timeline
    title Planning chain
    Instance 1 : Lock intent (this doc)
    Instance 2 : Research + checklist + ADRs
    Instance 3 : Build + pixel-verify the tab