who's transcribing interviews, and why
"transcribing an interview" is the highest-volume buyer search in the transcription category, and the buyers are structurally different from each other. four common shapes:
- journalists transcribing source interviews for stories. the job ends with a quote in print. accuracy on that quote is non-negotiable; the rest of the transcript is search material to find the quote in.
- qualitative researchers transcribing interviews for coding in NVivo, ATLAS.ti, or MAXQDA. the job ends with the transcript imported into a CAQDAS package and segmented for coding. format matters as much as accuracy.
- hiring managers transcribing candidate interviews for hiring committees, scorecards, and reference checks. accuracy on competency answers matters; the transcript becomes evidence.
- product / UX researchers transcribing user interviews for synthesis, jobs-to-be-done mapping, and clip libraries. the job ends with insights shared back to the product team, often with timed clips.
all four jobs share one requirement: the citable moments in the transcript need to be findable and verifiable fast. the rest of the cleanup workflow varies by audience.
the workflow
- upload the recording. phone-quality audio, lapel mic, zoom recording, in-person recorder: anything ffmpeg can read. mp3, m4a, wav, mp4 (zoom and other video; audio extracted), webm, flac.
- transcription runs. on a 60-minute interview, the first pass is ready in 1–3 minutes (cloud mode). on-device private mode runs at roughly real-time (about an hour for that same interview), and is the right choice for interviews with anonymous sources, sensitive material, or anything under an NDA.
- fix labels in bulk. "speaker 1" becomes "interviewer", "speaker 2" becomes the source's name (or pseudonym, or participant ID). one click propagates through every turn. proper nouns the source mentioned (companies, people, technical terms) are fixed once and remembered across future interviews in the same project.
- verify quotes against audio. the editor links every word to its second of audio. find a quote you want to use, click the first word, listen back. if the model got a word wrong, type the correction. for any quote that's going to print, this verification step is non-negotiable; we built the editor to make it ten seconds instead of ten minutes.
- export to your downstream tool. .docx for journalism workflows. NVivo CSV / ATLAS.ti / MAXQDA for qualitative researchers. plain text for scorecards and committees. .srt or json for video clip libraries. all from the same transcript, no re-transcription.
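a minimal sketch of the label and export steps, to make them concrete. everything here is illustrative: the `Turn` shape, `rename_speakers`, and `to_srt` are hypothetical names, not the editor's actual data model or API. it assumes a transcript is a list of speaker turns with start and end times in seconds.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str   # e.g. "speaker 1" before relabeling
    start: float   # seconds from the start of the recording
    end: float
    text: str


def rename_speakers(turns, mapping):
    """Return new turns with speaker labels replaced wherever the mapping has an entry."""
    return [Turn(mapping.get(t.speaker, t.speaker), t.start, t.end, t.text) for t in turns]


def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used in .srt files."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(turns):
    """Render turns as numbered .srt cues, one cue per turn, separated by blank lines."""
    cues = []
    for index, t in enumerate(turns, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(t.start)} --> {srt_timestamp(t.end)}\n"
            f"{t.speaker}: {t.text}\n"
        )
    return "\n".join(cues)


turns = [
    Turn("speaker 1", 0.0, 2.5, "how did it start?"),
    Turn("speaker 2", 2.5, 6.0, "in a garage."),
]
labeled = rename_speakers(turns, {"speaker 1": "interviewer", "speaker 2": "P07"})
print(to_srt(labeled))
```

the point of the sketch is the ordering: labels are fixed once on the transcript, and every export format is rendered from that same corrected transcript.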
the privacy case for sensitive interviews
some interviews can't sit on a third-party transcription server:
- anonymous sources in journalism, where the audio's existence on a vendor's backups is itself a risk
- NDA-bound interviews in user research with pre-launch products
- IRB-restricted research with data-residency rules
- internal-investigation interviews where the audio is potentially privileged
- candidate interviews in jurisdictions with strict consent and data-handling requirements
for those, run the file in private mode. the model runs in your browser using WebGPU, and the audio file is never sent over the network. there's no vendor in the chain. the audit takes five minutes; here's how.
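one way to check a no-upload claim yourself, sketched under assumptions: record a session in your browser's devtools network tab, export it as a HAR file, and look for requests that carry a large body. this is not necessarily the audit procedure referred to above. `upload_candidates` is a hypothetical helper, and the 100 kB threshold is an arbitrary illustration; the `log.entries[].request` fields (`method`, `url`, `bodySize`) come from the HAR format itself.

```python
import json

UPLOAD_METHODS = {"POST", "PUT", "PATCH"}


def upload_candidates(har, min_bytes=100_000):
    """List (method, url, bodySize) for requests that could have carried audio out.

    Flags any request whose body is at least min_bytes, plus upload-style
    requests whose body size the browser recorded as unknown (-1).
    """
    flagged = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        method = request.get("method", "").upper()
        size = request.get("bodySize", -1)
        if size >= min_bytes or (size < 0 and method in UPLOAD_METHODS):
            flagged.append((method, request.get("url", "?"), size))
    return flagged


def audit_har_file(path):
    """Load a HAR export from disk and return its upload candidates."""
    with open(path, encoding="utf-8") as handle:
        return upload_candidates(json.load(handle))
```

an empty result for a private-mode session is consistent with the audio staying local. large downloads (the model weights, for example) are expected and are not flagged, because only request bodies are inspected.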
quote verification — the actual job
for any interview transcript that ends with a published quote, the most expensive task in the workflow isn't transcription; it's verification. the moment you pull a quote, you have to listen to the original audio to confirm it's accurate. AI models are confidently wrong often enough that you can't skip this: an unverified AI-transcribed quote is exactly the kind of error that ends in a correction or a retraction.
on most tools, verification means: open the audio file in a separate program, scrub the timeline to roughly the right place, listen, agree or disagree, type the correction. for a 30-minute interview with 20 quotes you care about, that's 8–10 minutes of context-switching.
the editor here knows where each word is in the audio. you click the first word of the quote and the audio plays from that exact second. you don't switch programs. the verification step takes ten seconds. for the broader argument about why this matters, see the cleanup tax post.
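the idea of word-level timing can be sketched in a few lines. this is an illustration under assumptions, not the editor's implementation: `Word` and `find_quote_start` are hypothetical names, and matching is done on lowercased words with punctuation stripped.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording


def _normalize(token):
    """Lowercase a token and drop punctuation (apostrophes are kept)."""
    return re.sub(r"[^\w']", "", token.lower())


def find_quote_start(words, quote):
    """Return the start time of the first word of `quote`, or None if it is not found."""
    target = [t for t in (_normalize(tok) for tok in quote.split()) if t]
    if not target:
        return None
    # Keep only words that survive normalization, paired with their timestamps.
    spoken = [(n, w.start) for w in words if (n := _normalize(w.text))]
    for i in range(len(spoken) - len(target) + 1):
        if [n for n, _ in spoken[i:i + len(target)]] == target:
            return spoken[i][1]
    return None


words = [Word("We", 12.0), Word("never", 12.4), Word("shipped", 12.9), Word("it.", 13.3)]
print(find_quote_start(words, "never shipped it"))
```

with the start time in hand, playback can seek straight to that second, which is what removes the scrubbing step.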
which tool fits your interview type
- journalism interview, sensitive source: private mode. journalism cluster page has the structural argument.
- qualitative research, multi-interview study: choose your CAQDAS export profile. NVivo, ATLAS.ti, or MAXQDA.
- conversation analysis, sociolinguistic work: jefferson notation export with timed pauses and overlap brackets.
- candidate / hiring interview: cloud mode is fine; .docx export for the scorecard. on-device for jurisdictions with strict consent / data-handling rules.
- product / UX research interview: private mode for NDA-bound interviews; .docx + json export for synthesis tools that ingest both.
pricing for interviews
$0.25 per minute. a 30-minute interview is $7.50. a 60-minute interview is $15. private mode and cloud mode are the same price. no subscription, no minimum. for research projects with 20+ interviews, batch pricing arrives after launch.
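for budgeting a study, the arithmetic is a straight multiplication. a small sketch, assuming cost scales linearly with recording length at the listed rate; how partial minutes are billed isn't stated here, so the function below simply takes minutes as given. `interview_cost` is a hypothetical helper name.

```python
from decimal import Decimal

RATE_PER_MINUTE = Decimal("0.25")  # listed price, in dollars per minute


def interview_cost(minutes):
    """Cost in dollars for a recording of the given length, rounded to cents."""
    return (RATE_PER_MINUTE * Decimal(minutes)).quantize(Decimal("0.01"))


print(interview_cost(60))                          # 15.00
print(interview_cost(45))                          # 11.25
print(sum(interview_cost(60) for _ in range(20)))  # 300.00 for twenty 60-minute interviews
```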