the chain we're giving away.
there's something genuinely magical about the combination: ffmpeg → LLM → time-aligned transcript → remotion. ffmpeg processes files locally and emits structured output (silence regions, loudness curves, scene cuts). an LLM layer adds semantic judgment to that output. the transcript anchors every spoken word to a timestamp. remotion turns the whole thing into rendered video (captioned clips, highlight reels, audiograms) without uploading anything to anyone.
we use this stack to build audiohighlight. the same stack, surfaced as a row of free tools, is the demo. anyone who runs one of these and gets back a finished file understands what audiohighlight is for.
tier 1 — ffmpeg in your browser
pure ffmpeg.wasm wrappers. file conversions, format manipulation, the things people pay clideo / 123apps / clipchamp subscriptions for — except local, free, and never uploaded.
voice recorder
live: record audio from your microphone. download as webm, mp3, or wav. nothing uploads.
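a minimal sketch of one piece a recorder like this might need: choosing a container the browser can actually record to. the helper name and the candidate list are hypothetical, not taken from the tool. the support check is injected so the logic runs outside a browser; in a page it would likely wrap `MediaRecorder.isTypeSupported`.

```typescript
// Hypothetical helper: return the first MIME type the recorder supports.
// In a browser, pass (t) => MediaRecorder.isTypeSupported(t) as isSupported.
function pickRecordingMimeType(
  candidates: readonly string[],
  isSupported: (mimeType: string) => boolean,
): string | undefined {
  return candidates.find((candidate) => isSupported(candidate));
}

// Assumed preference order, for illustration only.
const RECORDING_CANDIDATES: readonly string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
];

// Example with a stubbed support check that only accepts plain webm.
const chosen = pickRecordingMimeType(
  RECORDING_CANDIDATES,
  (t) => t === "audio/webm",
);
console.log(chosen);
```

browsers typically record to webm or mp4 rather than mp3 or wav, so those two download formats would presumably come from a local conversion step after recording.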
extract audio from video
live: drop a video, get an mp3 back. ffmpeg.wasm in your browser. nothing uploads.
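a rough sketch of how an extraction like this could be wired up, not the tool's actual code. `FfmpegLike` is a hypothetical interface that mirrors three methods of the `FFmpeg` class in `@ffmpeg/ffmpeg` 0.12 (assuming that version), the file names are placeholders, and the mp3 settings assume the loaded core includes `libmp3lame`.

```typescript
// Minimal shape of the engine this sketch needs. Assumed to match the
// writeFile / exec / readFile methods of FFmpeg in @ffmpeg/ffmpeg 0.12.
interface FfmpegLike {
  writeFile(path: string, data: Uint8Array): Promise<unknown>;
  exec(args: string[]): Promise<number>;
  readFile(path: string): Promise<Uint8Array | string>;
}

// Build the ffmpeg arguments: -vn drops the video stream, and the audio is
// encoded as VBR MP3 (quality level 2 is an assumed default).
function buildExtractArgs(input: string, output: string): string[] {
  return ["-i", input, "-vn", "-c:a", "libmp3lame", "-q:a", "2", output];
}

// Write the video into ffmpeg.wasm's in-memory filesystem, run the
// conversion, and read the MP3 bytes back. Nothing leaves the page.
async function extractAudio(
  ffmpeg: FfmpegLike,
  video: Uint8Array,
): Promise<Uint8Array> {
  await ffmpeg.writeFile("input.mp4", video); // placeholder file name
  const exitCode = await ffmpeg.exec(buildExtractArgs("input.mp4", "output.mp3"));
  if (exitCode !== 0) {
    throw new Error(`ffmpeg exited with code ${exitCode}`);
  }
  const out = await ffmpeg.readFile("output.mp3");
  if (typeof out === "string") {
    throw new Error("expected binary output from readFile");
  }
  return out;
}
```

in a page, the returned bytes could be wrapped in a `Blob` and offered as a download link, which keeps the whole round trip on the user's machine.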
tier 2 — ffmpeg with semantic judgment
ffmpeg gives you structure (silence regions, loudness curves). an LLM tells you which silences are dead air vs. intentional pauses, which loudness peaks are music vs. dialog. the combination is more useful than either alone. shipping next.
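to make "structure" concrete: ffmpeg's `silencedetect` audio filter (for example `-af silencedetect=noise=-30dB:d=0.5 -f null -`) logs `silence_start` and `silence_end` lines. a sketch of turning that log into regions an LLM could then judge might look like the following. the type and function names are hypothetical, and the threshold values are only examples.

```typescript
// One detected silence, in seconds.
interface SilenceRegion {
  start: number;
  end: number;
  duration: number;
}

// Parse ffmpeg silencedetect log lines into regions.
// Each region pairs a silence_start line with the next silence_end line.
function parseSilenceLog(log: string): SilenceRegion[] {
  const regions: SilenceRegion[] = [];
  let pendingStart: number | null = null;

  for (const line of log.split("\n")) {
    const startMatch = line.match(/silence_start:\s*(-?[\d.]+)/);
    if (startMatch) {
      pendingStart = parseFloat(startMatch[1]);
      continue;
    }
    const endMatch = line.match(
      /silence_end:\s*(-?[\d.]+)\s*\|\s*silence_duration:\s*(-?[\d.]+)/,
    );
    if (endMatch && pendingStart !== null) {
      regions.push({
        start: pendingStart,
        end: parseFloat(endMatch[1]),
        duration: parseFloat(endMatch[2]),
      });
      pendingStart = null;
    }
  }
  return regions;
}

// Example log in the format silencedetect emits.
const sampleLog = [
  "[silencedetect @ 0x55d0] silence_start: 1.5",
  "[silencedetect @ 0x55d0] silence_end: 3.25 | silence_duration: 1.75",
].join("\n");

console.log(JSON.stringify(parseSilenceLog(sampleLog)));
```

the resulting array is small enough to hand to a model as JSON alongside the question of which regions are dead air and which are intentional pauses.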
— coming soon —
tier 3 — the transcript layer
add a time-aligned transcript and the tools sharpen further: chapter markers based on topic shifts, quotable moments ranked by emotional weight, automatic redaction of named entities. this is where the chain starts becoming clearly differentiated from anything else in the category.
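as an illustration of what a time-aligned transcript makes possible, here is a sketch of the redaction case: given words with timestamps and some way of flagging named entities, compute the time ranges to mute. the data shape and function names are assumptions, and the entity check is a stand-in for whatever the LLM or an NER step would return.

```typescript
// Assumed shape of one transcript word, with times in seconds.
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

interface TimeRange {
  start: number;
  end: number;
}

// Collect time ranges covering flagged words. Consecutive flagged words
// (e.g. a first and last name) are merged into a single range.
function redactionRanges(
  words: readonly TimedWord[],
  isEntity: (word: TimedWord) => boolean,
): TimeRange[] {
  const ranges: TimeRange[] = [];
  let open: TimeRange | null = null;

  for (const word of words) {
    if (isEntity(word)) {
      if (open) {
        open.end = word.end; // extend the current run
      } else {
        open = { start: word.start, end: word.end };
        ranges.push(open);
      }
    } else {
      open = null; // a non-entity word closes the run
    }
  }
  return ranges;
}

// Example: the entity set is a hard-coded stand-in for a model's output.
const exampleWords: TimedWord[] = [
  { text: "call", start: 0, end: 0.4 },
  { text: "Jane", start: 0.5, end: 0.8 },
  { text: "Doe", start: 0.85, end: 1.2 },
  { text: "tomorrow", start: 1.3, end: 1.9 },
  { text: "Acme", start: 2.0, end: 2.4 },
];
const flagged = new Set(["Jane", "Doe", "Acme"]);

console.log(JSON.stringify(redactionRanges(exampleWords, (w) => flagged.has(w.text))));
```

the same word-level timestamps would also serve chapter markers and quote ranking, since both reduce to picking word indices and reading off their start and end times.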
tier 4 — remotion + finished video
the full chain. paste a transcript range, get back a captioned audiogram. let the LLM pick the three most quotable moments, get back a 60-second highlight reel. captions burned in, music bed under, brand colors applied — all rendered locally, no upload. shipping after the tier-3 tools land.
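remotion works in frames rather than seconds, so one step in a chain like this is converting word timestamps into frame-based caption cues. a minimal sketch, assuming a word-level transcript shape and a hypothetical `toCaptionCues` helper; the `from` and `durationInFrames` fields are named to line up with the props of remotion's `Sequence` component.

```typescript
// Assumed shape of one transcript word, with times in seconds.
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

// One on-screen caption, positioned in frames.
interface CaptionCue {
  text: string;
  from: number;
  durationInFrames: number;
}

// Group words into fixed-size chunks and convert their time span to frames.
// wordsPerCue = 3 is an arbitrary illustrative default.
function toCaptionCues(
  words: readonly TimedWord[],
  fps: number,
  wordsPerCue = 3,
): CaptionCue[] {
  const cues: CaptionCue[] = [];
  for (let i = 0; i < words.length; i += wordsPerCue) {
    const group = words.slice(i, i + wordsPerCue);
    const startFrame = Math.floor(group[0].start * fps);
    const endFrame = Math.ceil(group[group.length - 1].end * fps);
    cues.push({
      text: group.map((w) => w.text).join(" "),
      from: startFrame,
      // Every cue is shown for at least one frame.
      durationInFrames: Math.max(1, endFrame - startFrame),
    });
  }
  return cues;
}

const cueWords: TimedWord[] = [
  { text: "a", start: 0, end: 0.5 },
  { text: "b", start: 0.5, end: 1.0 },
  { text: "c", start: 1.0, end: 1.5 },
  { text: "d", start: 1.5, end: 2.0 },
];

console.log(JSON.stringify(toCaptionCues(cueWords, 30)));
```

each cue could then be rendered inside its own `Sequence`, with styling such as brand colors handled by the component that draws the text.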
— coming soon —
why we give these away.
three reasons, roughly equally weighted:
- they're useful. people need format conversions, transcripts, and captioned clips all the time. the existing options upload your file to someone else's server. ours don't. that's a real improvement we'd want to use ourselves.
- they're the demo. the easiest way to understand audiohighlight is to use a piece of the chain. someone who runs the highlight-reel tool on a podcast and gets back a finished captioned video instantly understands what the paid product is for.
- they reinforce the privacy posture. we tell journalists, lawyers, and therapists that the audio stays on their machine. saying it on a marketing page is cheap. shipping a row of free tools that demonstrably run locally is proof. the audit takes five minutes — we wrote up how.