transcribe a Zoom recording without putting a bot in your meeting.

most Zoom transcription tools want to join the call as a participant. some of your meetings can't have a bot in them. here's how to transcribe the recording cleanly, after the fact.

two ways Zoom recordings exist

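when you record a Zoom meeting, the file lands in one of two places: in your Zoom account (a cloud recording) or on your computer (a local recording). the transcription workflow differs only in where you fetch the file from.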

either kind of recording is just an audio file once it exists. drop it in, transcribe it, edit, export.

the workflow

  1. locate the recording. for cloud recordings: zoom.us → recordings → download. for local recordings: ~/Documents/Zoom/[meeting-name]/ on mac, %USERPROFILE%\Documents\Zoom\ on windows.
  2. drop the file into audiohighlight. mp4, m4a, mp3, wav, webm — anything Zoom exports works. video files have audio extracted automatically.
  3. transcription runs. on a 60-minute Zoom recording, the first pass is ready in 1–3 minutes (cloud mode) or in roughly real time, so about an hour (on-device private mode for sensitive meetings).
  4. fix the speaker labels. Zoom doesn't pass speaker identity through to the file, so the diarization is what we infer from voice patterns. relabel "speaker 1" to the actual person's name once, and the change propagates through every row.
  5. verify quotes against the recording. click any word in the transcript, hear that second of audio. for any meeting whose transcript becomes evidence — performance reviews, candidate interviews, product decisions, customer-success cases — this is the verification step.
  6. export. .docx for the meeting notes that go to the team. .srt or .vtt for adding captions to the recording before sharing. plain text for paste-into-doc workflows.
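
a minimal sketch of step 1 for local recordings, in Python. it assumes Zoom's default folder (Documents/Zoom under your home directory, matching the paths above) and keeps only the extensions listed in step 2. `find_zoom_recordings` and `MEDIA_EXTENSIONS` are hypothetical names for illustration, not part of any Zoom or audiohighlight API.

```python
from pathlib import Path
from typing import List, Optional

# Extensions listed in step 2 of the workflow.
MEDIA_EXTENSIONS = {".mp4", ".m4a", ".mp3", ".wav", ".webm"}


def find_zoom_recordings(root: Optional[Path] = None) -> List[Path]:
    """Return media files under the Zoom recordings folder, newest first."""
    # Assumption: Zoom's default local folder. Pass `root` explicitly if
    # the save location was changed in Zoom's settings.
    if root is None:
        root = Path.home() / "Documents" / "Zoom"
    if not root.is_dir():
        return []
    files = [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in MEDIA_EXTENSIONS
    ]
    # Sort by modification time so the latest meeting comes first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in find_zoom_recordings():
        print(path)
```

run as a script, it prints every candidate file, most recent first. if the folder doesn't exist, it prints nothing rather than failing.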

why no bot

the dominant workflow for Zoom transcription in 2026 is a bot (otter, fireflies, fathom) that joins your meeting as a participant and transcribes live. for many internal team meetings, that's a perfectly good choice. for a meaningful subset of meetings (medical, legal, journalism, investigation, board), it isn't.

for any of those, the workflow that works is: you record the call yourself (Zoom's local recording does this), and you transcribe the file afterward, using a tool that doesn't need to be in the meeting.

private mode for sensitive Zoom recordings

for the meetings above (medical, legal, journalism, investigation, board), even uploading the recording to a cloud transcription tool after the fact can be a problem. the audio sits on the vendor's servers, where it's reachable through legal process the way any vendor-held document is.

private mode runs the speech-recognition model in your browser using WebGPU. you drop the recording into the editor and the model transcribes locally: your audio is never sent over the network, never reaches our servers, and never sits in any third-party storage. for the structural argument and the audit instructions, see private transcription.

handling Zoom's quirks

pricing for Zoom recordings

$0.25 per minute. a 30-minute Zoom call is $7.50. a 60-minute team meeting is $15. private mode and cloud mode are the same price. no subscription, no minimum. for teams with steady weekly meeting volume, batch pricing arrives after launch.
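
the arithmetic, as a small Python sketch. it assumes billing is a straight multiple of whole minutes at the $0.25 rate, since how partial minutes are billed isn't stated here. `transcription_cost` is a hypothetical helper, not a real audiohighlight function.

```python
from decimal import Decimal

# Rate quoted above: $0.25 per minute, same for cloud and private mode.
RATE_PER_MINUTE = Decimal("0.25")


def transcription_cost(minutes: int) -> Decimal:
    """Cost in dollars for a recording of the given length in whole minutes."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    # Decimal keeps the cents exact; floats can drift on money math.
    return RATE_PER_MINUTE * minutes


print(transcription_cost(45))  # 11.25
print(transcription_cost(60))  # 15.00
```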

lifetime deal while we're in beta.

join the waitlist to get a lifetime deal — your first month free, plus 50% off forever. private invite when we ship; no drip campaign.