when this matters for bloggers
a finished blog post is public. the draft, the source interview, the half-formed voice note from a walk — those are not. the gap between "i recorded this" and "i hit publish" is where most blogger-specific privacy concerns live.
a few that come up:
- competitive intel risk. you cover a vertical. your draft mentions a company, a product, a take you haven't published yet. the audio file containing that draft, sitting on a transcription vendor's server, is a leak surface: small, but real, and avoidable.
- embargo risk for industry bloggers. a source previewed news under embargo. you record a voice memo with your reaction so you don't lose the thought. uploading that memo to a third-party transcription service before the embargo lifts is the same leak risk a podcaster or journalist faces — just less obvious.
- source protection on niche beats. indie security bloggers, legal-beat writers, medical writers, harm-reduction writers. your sources expect that their audio doesn't move through additional vendors. removing the transcription vendor closes one avenue.
- "this is my draft, i want it to stay mine." you're not a journalist with a legal threat model. you just don't want your unpublished work sitting in someone else's data warehouse, indefinitely, under a TOS that can change.
workflow
- capture the audio. record in the browser with the voice recorder for voice notes and ad-hoc dictation, or upload an existing file from a phone, a field recorder, a zoom call, a conference session you saved.
- open audiohighlight, select private mode. the transcription runs locally in the browser via WebGPU + Whisper. nothing uploads.
- review the transcript. fix speaker labels for "me" and "interviewee" once each, and they propagate. word-level timestamps mean you click a word to replay the audio at that second — useful when you're verifying a quote before it goes in the post.
- export. .docx for editing in word or google docs, .md for editing in obsidian or committing straight to a static-site repo. the blog-post export collapses the speaker-by-speaker structure into a clean long-form draft you can edit into a finished post.
- close the tab. the audio file stays on your machine; the transcript is wherever you saved it. nothing on our side.
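to make the review and export steps concrete, here is a minimal sketch of "fix a label once, it propagates" and of collapsing speaker turns into paragraphs. this is not audiohighlight's code: the `Segment` shape, the `relabel` and `collapse` helpers, and the sample transcript are all hypothetical, written only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # raw label from the transcript, e.g. "speaker 1" (assumed format)
    start: float  # seconds into the audio
    text: str


def relabel(segments, names):
    """apply one raw-label -> name mapping to every segment at once."""
    return [Segment(names.get(s.speaker, s.speaker), s.start, s.text) for s in segments]


def collapse(segments):
    """merge consecutive segments from the same speaker into one paragraph each."""
    paragraphs = []
    for seg in segments:
        if paragraphs and paragraphs[-1][0] == seg.speaker:
            # same speaker is still talking: extend the current paragraph
            paragraphs[-1] = (seg.speaker, paragraphs[-1][1] + " " + seg.text)
        else:
            paragraphs.append((seg.speaker, seg.text))
    return paragraphs


# hypothetical three-segment transcript
segments = [
    Segment("speaker 1", 0.0, "so what changed this year?"),
    Segment("speaker 2", 2.4, "mostly the pricing."),
    Segment("speaker 2", 4.1, "we dropped the seat minimum."),
]
named = relabel(segments, {"speaker 1": "me", "speaker 2": "interviewee"})
draft = collapse(named)
```

the mapping is applied in one pass, so a label is corrected once rather than per segment, and `draft` ends up with two paragraphs: one for "me" and one for "interviewee".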
three jobs this fits
- voice notes from a walk. you have a thesis on a 20-minute walk. record it in the browser, transcribe it when you sit down, edit the transcript into the spine of a post. the dictation-to-draft loop without the audio leaving your laptop.
- interviews for a post. you call a source, a founder, a researcher. you record the call (with consent). private mode transcribes the file, you pull quotes, you cite. the source's audio doesn't move through a vendor that wasn't part of the agreement.
- talks, podcasts, conference sessions into posts. you spoke at a conference. you recorded a podcast. you have 45 minutes of usable thinking trapped in audio. transcribe the file, export blog-post format, edit it into a long-form piece. the original audio is your work product, not a vendor's.
where on-device fits and where it doesn't
- fits: pre-publication drafts and voice notes. anything where the audio leaving your machine adds a risk you'd rather not own.
- fits: a single laptop, a single post. a 30-minute interview transcribes in about 30 minutes on a current macbook. that's slower than cloud — but cloud isn't free either, and for one or two files at a time, the wait is fine.
- doesn't fit: batch-transcribing a back catalog. if you're processing 40 hours of old podcast episodes for an archive project, cloud mode is the right tool. on-device runs at roughly real-time per file; cloud is 5–10x faster.
- doesn't fit: live drafting during a call. we transcribe finished files. for live captions during a zoom interview, use a dedicated tool — that's a different product.
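the batch trade-off above is simple arithmetic. a rough sketch, using only the figures on this page (on-device at about real-time, cloud at 5–10x that); `estimate_hours` is a hypothetical helper, and real speed will vary by machine and file.

```python
def estimate_hours(audio_hours, speedup=1.0):
    """processing time in hours, given a speedup relative to real-time."""
    return audio_hours / speedup


back_catalog = 40  # hours of old episodes, from the example above

on_device = estimate_hours(back_catalog)         # 40.0 hours at roughly real-time
cloud_slow = estimate_hours(back_catalog, 5.0)   # 8.0 hours at 5x
cloud_fast = estimate_hours(back_catalog, 10.0)  # 4.0 hours at 10x
```

for a single 30-minute interview the same function gives half an hour on-device, which is why one or two files at a time is fine and a 40-hour archive is not.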
what we don't claim
we don't claim verbatim accuracy. for direct quotes that go in the post, listen to the audio with the transcript open and verify the ones that matter. the editor is built for this — click the word, hear the second.
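the quote check works because every word carries a timestamp. a minimal sketch of that lookup, assuming a transcript is a list of `(word, start_seconds)` pairs; the data shape, the `find_quote` helper, and the sample words are illustrative assumptions, not the editor's internals.

```python
import re


def normalize(token):
    """lowercase and strip punctuation so 'Minimum.' matches 'minimum'."""
    return re.sub(r"[^\w']", "", token.lower())


def find_quote(words, quote):
    """return the start time (seconds) of the first place the quote appears, or None."""
    target = [t for t in (normalize(q) for q in quote.split()) if t]
    tokens = [normalize(w) for w, _ in words]
    n = len(target)
    if n == 0:
        return None
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            return words[i][1]
    return None


# hypothetical word-level transcript
words = [
    ("we", 12.0), ("dropped", 12.3), ("the", 12.7),
    ("seat", 12.9), ("minimum.", 13.2),
]
start = find_quote(words, "Dropped the seat")  # 12.3
```

`start` is where you would begin listening to confirm the wording. a quote that returns `None` does not appear word-for-word in the transcript, which is a signal to listen before publishing it.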
we don't claim the on-device model handles all accents and languages equally. english is best. for a non-english interview, cloud mode is more accurate; private mode degrades on languages outside the on-device model's strong set.
pricing
$0.25 per minute. a 30-minute interview costs $7.50; an hour of conference audio runs $15. private mode and cloud mode are the same price. no subscription, no minimum. if you publish on a steady weekly cadence with a stable batch volume, write to hello@audiohighlight.com and we'll work something out.
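to budget a post before you record, the per-minute rate multiplies out directly. a small sketch; `cost_dollars` is a hypothetical helper, and it assumes whole minutes, since how partial minutes are billed isn't stated here.

```python
RATE_CENTS_PER_MINUTE = 25  # $0.25 per minute, same for private and cloud mode


def cost_dollars(minutes):
    """price in dollars for a whole number of audio minutes."""
    return minutes * RATE_CENTS_PER_MINUTE / 100


interview = cost_dollars(30)   # 7.5
conference = cost_dollars(60)  # 15.0
```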