
From confused user to AI-tool-ready spec in 90 seconds. Walk through any page, comment on what's wrong, and we hand you back a Cursor task, a Lovable prompt, a v0 design brief, a Claude.md spec, and structured JSON. Same recording, five outputs.
You spot a bug in your own product. Or a customer Slacks you a screen-recording. Or you watch a Lovable preview and think "the menu's wrong, the copy's confusing, the empty state needs a button." Then you sit down to write the bug report.
That's where the loss happens. You translate the live experience into a Linear ticket, then translate the ticket into a Cursor prompt, then debug what the AI wrote because half the context dropped along the way. You did the thinking three times. Two of those times the AI didn't see what you saw.
Walkthrough as Spec collapses those three translations into one. The recording is the spec. Every click, every keystroke, every comment — captured with full DOM context, then handed straight to whichever AI tool you build with.
A floating button on any dashboard page. One click starts capture — no screen-share dialog, no install.
Click around like a normal user. Cmd/Ctrl+Click an element to inspect — outerHTML, accessibility tree, two levels of parent context, all captured.
Cmd/Ctrl+/ on any element opens a comment input. "This label is wrong." "Empty state needs a button." Bound to that exact element + timestamp.
We generate five spec formats in parallel. Open the format that matches your tool. Paste. Watch the fix land.
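Mechanically, the capture layer is ordinary DOM eventing. Here's a minimal sketch of the two shortcuts; every name in it (the `timeline` array, `cssPath`, the `window.prompt` stand-in for the real inline comment input) is illustrative, not the shipped code:

```typescript
// Sketch of the in-page capture layer. All names are illustrative.
type TimelineEntry =
  | { kind: "inspect"; timestampMs: number; selector: string; outerHTML: string; parents: string[] }
  | { kind: "comment"; timestampMs: number; selector: string; text: string };

const timeline: TimelineEntry[] = [];

// Naive selector builder: tag + nth-child chain up to <body>.
function cssPath(el: Element): string {
  const parts: string[] = [];
  for (let n: Element | null = el; n && n !== document.body; n = n.parentElement) {
    const i = Array.from(n.parentElement?.children ?? []).indexOf(n) + 1;
    parts.unshift(`${n.tagName.toLowerCase()}:nth-child(${i})`);
  }
  return parts.join(" > ");
}

// Cmd/Ctrl+Click: record the element plus two levels of parent context.
document.addEventListener("click", (e) => {
  if (!(e.metaKey || e.ctrlKey)) return;
  const el = e.target as Element;
  const parents: string[] = [];
  for (let p = el.parentElement, i = 0; p && i < 2; p = p.parentElement, i++) {
    parents.push(p.outerHTML);
  }
  timeline.push({
    kind: "inspect",
    timestampMs: performance.now(),
    selector: cssPath(el),
    outerHTML: el.outerHTML,
    parents,
  });
});

// Cmd/Ctrl+/: bind a comment to the focused element. window.prompt stands in
// for the real inline comment input.
document.addEventListener("keydown", (e) => {
  if (!(e.metaKey || e.ctrlKey) || e.key !== "/") return;
  const el = document.activeElement ?? document.body;
  const text = window.prompt("Comment on this element:");
  if (text) {
    timeline.push({ kind: "comment", timestampMs: performance.now(), selector: cssPath(el), text });
  }
});
```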
Different AI tools want different inputs. Cursor is happiest with structured YAML. Lovable wants natural-language paragraphs that reference what the user sees. v0 prefers design-first prose. Claude wants markdown with citations. JSON ingests anywhere. We generate all five from the same underlying capture.
```yaml
task: Fix empty state on /dashboard/leads
context: |
  User opened /dashboard/leads with no leads. The empty state shows
  "No data" but no Add button. Friction at 4.2s; user clicked twice
  on the heading expecting it to be interactive.
acceptance_criteria:
  - Empty state shows "Add your first lead" button
  - Button opens existing AddLeadDrawer
  - Heading is not clickable
files_to_modify:
  - src/pages/Leads.tsx
  - src/components/leads/EmptyState.tsx
```
```text
On the Leads page, when there are no leads yet, the empty state currently shows the text "No data" but doesn't give the user any way to add their first lead. The user also clicked the page heading twice expecting it to do something — make sure the heading isn't styled like a button. Add a clear "Add your first lead" button below the empty state message that opens the existing add-lead drawer. Build this fix without changing unrelated files.
```
```text
An empty state for a leads table. Centered layout, friendly copy, a single primary CTA button labeled "Add your first lead" in brand orange. Soft icon above the text. Subtle secondary line of muted copy below the button: "It only takes 30 seconds." Component: LeadsEmptyState
```
```markdown
## Product context

The /dashboard/leads page shows a list of leads. Empty when the user has just signed up.

## What the user did

At 0.0s the user navigated to /dashboard/leads. At 2.1s they clicked the heading "Leads" twice. At 4.2s they commented: "where do I add one???"

## Suggested fix path

Add an "Add your first lead" CTA to the empty state...
```
```json
{
  "summary": "Empty state on /dashboard/leads has no add CTA",
  "user_intent": "create first lead",
  "friction_points": [
    {
      "timestamp_ms": 4200,
      "description": "user comment: 'where do I add one???'",
      "severity": "high"
    }
  ],
  "events_count": 12,
  "target_components": ["LeadsEmptyState"]
}
```
Every recording is also classified for issue type and severity. Console errors during the walkthrough become ui_bug or data_error. Slow page transitions (>3 seconds) become slow_load. Broken links (404s) become broken_link. Repeated clicks on the same element become confusing_copy. User comments — "this label is wrong", "I expected data here" — become confusing_copy, missing_feature, or ui_bug depending on context. Each issue gets a severity score (critical / high / medium / low / info) and a recommended fix.
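As a rough sketch of how those rules compose (the issue types come straight from the list above; the severity assignments and the event shape are sketch-level assumptions):

```typescript
type IssueType =
  | "ui_bug" | "data_error" | "slow_load"
  | "broken_link" | "confusing_copy" | "missing_feature";
type Severity = "critical" | "high" | "medium" | "low" | "info";

interface RawEvent {
  kind: "console_error" | "navigation" | "click" | "http" | "comment";
  durationMs?: number;  // navigation events
  status?: number;      // http responses
  repeatCount?: number; // clicks on the same element
  text?: string;        // user comments
}

interface Issue { type: IssueType; severity: Severity; evidence: RawEvent }

function classify(ev: RawEvent): Issue | null {
  // Console errors map to ui_bug here; the real pipeline also distinguishes data_error.
  if (ev.kind === "console_error") return { type: "ui_bug", severity: "high", evidence: ev };
  if (ev.kind === "navigation" && (ev.durationMs ?? 0) > 3000)
    return { type: "slow_load", severity: "medium", evidence: ev };
  if (ev.kind === "http" && ev.status === 404)
    return { type: "broken_link", severity: "high", evidence: ev };
  if (ev.kind === "click" && (ev.repeatCount ?? 0) >= 2)
    return { type: "confusing_copy", severity: "low", evidence: ev };
  // Comments are context-dependent: the model picks between confusing_copy,
  // missing_feature, and ui_bug. One branch is hard-coded here for the sketch.
  if (ev.kind === "comment") return { type: "missing_feature", severity: "medium", evidence: ev };
  return null;
}
```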
Result: a spec for the AI tool to build with, AND a triage list for you to prioritize.
You're building a product. Lovable just generated the dashboard. Three things are wrong but you can't be bothered to type them out. Hit Record, walk through the dashboard, comment on what's broken, stop. The Lovable prompt format is already in your clipboard. Paste it into the next iterate-on-Web turn. Lovable fixes the three things at once.
Your client recorded a walkthrough on staging. They mentioned eight issues. Instead of distilling the recording into a 600-word ticket, you take the JSON output, push it through Linear's API, and eight tickets land — each with severity, location, recommended fix, and the original comment quoted verbatim.
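A sketch of that hand-off, assuming the JSON shape shown earlier. The `issueCreate` mutation is Linear's standard GraphQL API; the team and field mapping here is invented:

```typescript
// Push each friction point to Linear as its own issue. The spec shape matches
// the JSON format above; the apiKey and teamId values are assumptions.
interface FrictionPoint { timestamp_ms: number; description: string; severity: string }
interface Spec { summary: string; friction_points: FrictionPoint[] }

async function createLinearIssues(spec: Spec, apiKey: string, teamId: string) {
  for (const fp of spec.friction_points) {
    const res = await fetch("https://api.linear.app/graphql", {
      method: "POST",
      headers: { "Content-Type": "application/json", Authorization: apiKey },
      body: JSON.stringify({
        query: `mutation($input: IssueCreateInput!) {
          issueCreate(input: $input) { issue { identifier url } }
        }`,
        variables: {
          input: {
            teamId,
            title: `[${fp.severity}] ${spec.summary}`,
            description: `${fp.description}\n\nAt ${fp.timestamp_ms} ms in the walkthrough.`,
          },
        },
      }),
    });
    console.log(await res.json()); // issue identifier + URL per created ticket
  }
}
```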
One designer uses v0. One engineer uses Cursor. The PM uses Claude. Same recording, three formats, three workflows — no translation overhead. The recording is the source of truth; the formats are derived views.
v0 captures: every click event with the target's selector + xpath + outerHTML + accessibility-tree snippet, two levels of parent context, console errors, navigation events, scroll position deltas, comment text bound to the focused element + timestamp.
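In type terms, a timeline entry carries roughly this shape (the field names are a sketch, not a published schema):

```typescript
// One variant per captured signal; names are illustrative.
type CapturedEvent =
  | {
      kind: "click";
      timestampMs: number;
      selector: string;
      xpath: string;
      outerHTML: string;
      accessibilitySnippet: string;    // accessibility-tree excerpt for the target
      parentContext: [string, string]; // two levels of parent outerHTML
    }
  | { kind: "comment"; timestampMs: number; selector: string; text: string }
  | { kind: "console_error"; timestampMs: number; message: string }
  | { kind: "navigation"; timestampMs: number; url: string }
  | { kind: "scroll"; timestampMs: number; deltaY: number };
```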
v0 does not yet capture: video of the screen, audio voice-notes. Both are coming in v0.2 — the underlying schema already supports them. For v0, comments via Cmd/Ctrl+/ replace voice-notes; the AI gets a textual proxy of what you'd have said out loud.
What's never captured: passwords (we strip type=password input values before persistence), aria-hidden=true trees, anything inside iframes you don't own. Row-level security (RLS) at the database level guarantees only you (and admins, when explicitly authorized) can read your recordings.
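The scrub happens before any event is persisted. A sketch of the idea, with illustrative names:

```typescript
// Decide what (if anything) of an element's value may be recorded.
function recordableValue(el: Element): string | null {
  // Never persist values typed into password fields.
  if (el instanceof HTMLInputElement && el.type === "password") return "[redacted]";
  // Skip trees the page hides from assistive tech.
  if (el.closest('[aria-hidden="true"]')) return null;
  // Skip elements from other documents (iframe content).
  if (el.ownerDocument !== document) return null;
  if (el instanceof HTMLInputElement || el instanceof HTMLTextAreaElement) return el.value;
  return el.textContent;
}
```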
Loom captures pixels. We capture structure. A Loom video is a black box to an AI tool — it can't extract the selector you clicked or the comment you bound to a specific element. Our recording is a structured event timeline with full DOM context, then we use that timeline to generate AI-shaped specs. The video (when v0.2 ships) becomes a supplementary artifact, not the primary one.
v0 is in-app only — the overlay lives inside the MentionFox dashboard, so you record yourself walking through MentionFox features (great for "this thing is broken" feedback to us). For your own product, embed the same overlay component (open-source-friendly licensing for Pro+ customers). For walking through third-party sites, the Tauri Desktop App ingest path lands in a future version — not v0.
Generation runs on Claude Sonnet (sonnet-4-7 at launch). One call per format, one call for issue detection. Total cost per recording runs $0.05–$0.20 in API tokens depending on length. Generation runs server-side; you never see API keys.
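The fan-out itself is simple. A sketch using the official Anthropic SDK, where the model id, prompt wording, and format keys are placeholders rather than the shipped values:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY server-side

// Placeholders: the shipped model id, prompts, and format keys differ.
const MODEL = "claude-sonnet-placeholder";
const KINDS = ["cursor_yaml", "lovable_prompt", "v0_brief", "claude_md", "json_spec", "issue_detection"];

async function generateSpecs(timelineJson: string): Promise<Record<string, string>> {
  // One call per format plus one for issue detection, all in parallel.
  const results = await Promise.all(
    KINDS.map((kind) =>
      client.messages.create({
        model: MODEL,
        max_tokens: 2048,
        messages: [{
          role: "user",
          content: `Produce the ${kind} output for this walkthrough timeline:\n${timelineJson}`,
        }],
      }),
    ),
  );
  const out: Record<string, string> = {};
  results.forEach((msg, i) => {
    const block = msg.content[0];
    out[KINDS[i]] = block && block.type === "text" ? block.text : "";
  });
  return out;
}
```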
Specs are editable before you paste. Each format opens in a copy-friendly view with a Copy button. Paste it anywhere — including into your own editor for a final review. We don't lock the spec; the recording is the canonical artifact and the spec is a derived view you can regenerate.
Pro+ recordings are user-scoped via RLS by default. Sharing is on the roadmap (v0.3 — share-token + read-only viewer). Agency tier already supports cross-seat reads within a single agency org.
v0 (today): DOM event capture, element inspection, inline comments, five spec formats, issue detection, admin + end-user dashboards.
v0.2 (next): Screen video capture (getDisplayMedia), audio capture (getUserMedia), Whisper transcription of audio into the spec context, video upload to Supabase Storage, video timeline scrub on the admin viewer.
v0.3: Share tokens with read-only viewer, agency cross-seat reads, embed mode for customer-facing products (your customers record on your site, you get the spec), webhook delivery to Linear / GitHub Issues / your own endpoints.
v1.0: Tauri Desktop App ingest, third-party site capture, native macOS/Windows recording with full audio + system events.