the problem
NVivo (and ATLAS.ti, and MAXQDA) expects transcripts in a specific schema: timestamps in hh:mm:ss, a speaker column, the utterance, and ideally the speaker turn identified row by row. CSV is the cleanest import path; .docx works if the formatting is right.
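concretely, the target shape is one row per speaker turn. a minimal illustration (the header names here are ours, not necessarily the exact strings NVivo's import wizard requires):

```
Timestamp,Speaker,Content
00:00:00,P03,"so the first thing we noticed was the scheduling."
00:00:12,INT,"and when did that start?"
```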
most transcription tools produce a generic .docx with paragraph breaks and bold speaker labels. that file imports — and then the researcher spends an hour rebuilding the timestamps, splitting the paragraphs into per-turn rows, and tagging the speaker column. for a study with twenty interviews, that hour stacks up.
search "nvivo transcript import format" and you get NVivo's own help docs (helpful, but documentation about the format — not a tool). a lone GitHub project (Teams2NVivo) confirms the demand exists by addressing a sliver of it. nothing on the first page produces a generic transcript in NVivo-shape from audio. that's the gap.
what we ship
a transcription pass that emits NVivo-shaped output directly:
- per-row timestamp in the format NVivo expects (hh:mm:ss). every speaker turn is a row, not a paragraph.
- speaker column from diarization. labels you fix in bulk in the editor propagate through every row for that speaker.
- utterance column with sentence-boundary segmentation that doesn't split mid-thought. mid-utterance pauses preserved as ellipses where the audio actually pauses, not where the model thinks grammar wants it.
- CSV and .docx export both NVivo-import-ready. the CSV uses the column headers NVivo's import wizard expects; the .docx uses the styled paragraph format their auto-coding step parses.
- profiles for ATLAS.ti and MAXQDA arriving after launch. same engine; different export target. (the underlying word-level-timestamps model is shared; the formatter is a thin layer.)
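the formatter really is a thin layer. a minimal sketch of the idea, assuming diarized turns arrive as dicts with a start time in seconds, a speaker label, and the utterance (the column headers and field names here are illustrative, not the shipped profile):

```python
import csv
import io

def hms(seconds: float) -> str:
    """Format a start time in seconds as the hh:mm:ss string NVivo expects."""
    s = int(seconds)
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}"

def to_nvivo_csv(turns, out) -> None:
    """Write one CSV row per speaker turn: timestamp, speaker, utterance."""
    writer = csv.writer(out)
    writer.writerow(["Timestamp", "Speaker", "Content"])  # illustrative headers
    for turn in turns:
        writer.writerow([hms(turn["start"]), turn["speaker"], turn["text"]])

# hypothetical diarization output for a two-speaker exchange
turns = [
    {"start": 0.0, "speaker": "Speaker 1", "text": "so tell me about your first week."},
    {"start": 7.4, "speaker": "Speaker 2", "text": "it was... chaotic, honestly."},
]
buf = io.StringIO()
to_nvivo_csv(turns, buf)
print(buf.getvalue())
```

swapping the export target for ATLAS.ti or MAXQDA means swapping `to_nvivo_csv` for another formatter over the same `turns` list; nothing upstream changes.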
workflow
- drop the interview audio. transcription runs and produces a first-pass transcript with diarization.
- review in the browser editor. fix speaker labels in bulk — "Speaker 1" becomes "P03" once and propagates through every row. fix proper nouns as they appear; the model learns study-specific vocabulary across files.
- export to NVivo CSV (or ATLAS.ti / MAXQDA equivalent). open it in NVivo. code it.
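the bulk speaker fix in step two is just a label mapping applied across every row. a sketch under the same assumed per-turn dict shape as above (field names illustrative):

```python
def relabel(turns, mapping):
    """Apply a bulk speaker-label fix: every row carrying an old
    diarization label gets the researcher's participant ID."""
    return [
        {**t, "speaker": mapping.get(t["speaker"], t["speaker"])}
        for t in turns
    ]

turns = [
    {"speaker": "Speaker 1", "text": "how did that feel?"},
    {"speaker": "Speaker 2", "text": "strange at first."},
    {"speaker": "Speaker 1", "text": "strange how?"},
]
fixed = relabel(turns, {"Speaker 1": "INT", "Speaker 2": "P03"})
```

because the label lives in one column rather than inline bold text, the fix is a single pass, not twenty find-and-replace runs per file.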
where this fits
- fits: semi-structured interviews, focus groups, ethnographic field recordings. anywhere the unit of analysis is the speaker turn and the timestamp matters for back-reference to audio.
- doesn't fit (yet): video data with frame-level coding. our export is audio-anchored. video-anchored coding requires the transcript to know about the video, which is on the roadmap but not at launch.
privacy
for IRB-restricted recordings, fieldwork where institutional review prohibits cloud processing, or any audio where data residency matters, run the file in private mode. the NVivo export works identically on local transcription; the audio just stays on your laptop.
citation
when you publish work that used this tool, citation language and a methodological footnote ("transcribed using audiohighlight, version X, NVivo profile, on date Y") are on the about page. we keep version numbers stable for replication.