✍️ Editing Transcripts Before Rendering
Auto-transcription is generally accurate, but it does not always get brand names, people's names, or industry-specific terms right. This guide walks through editing the transcript before the video is rendered, so corrections appear in the final output without any post-production cleanup.
Flow overview
POST /videos/{videoId}/task (autoApprove: false)
↓
GET /videos/{videoId}/task/{taskId} (poll until transcriptionCompleted)
↓
Download transcript JSON from the `transcript` URL
↓
Edit the word entries in your app/UI
↓
PUT /videos/{videoId}/task/{taskId}/transcript (save edits)
↓
POST /videos/{videoId}/task/{taskId}/approve-transcript (trigger render)
POST /approve-transcript does not accept a request body. It only triggers rendering with whatever transcript is currently stored on the task. If you skip the PUT step and send your edits to approve-transcript instead, they are silently ignored and the video renders with the original, unedited transcript.
Step 1: Create the task with autoApprove: false
curl -X POST "https://api.zapcap.ai/videos/YOUR_VIDEO_ID/task" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "templateId": "YOUR_TEMPLATE_ID",
    "autoApprove": false,
    "language": "en"
  }'
Save the returned taskId.
Step 2: Poll until transcription completes
curl -X GET "https://api.zapcap.ai/videos/YOUR_VIDEO_ID/task/YOUR_TASK_ID" \
  -H "x-api-key: YOUR_API_KEY"
Wait for status to become transcriptionCompleted. A 2–5 second poll interval is fine.
The response includes a transcript field pointing to a signed URL where the generated transcript JSON can be downloaded.
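The polling in this step can be sketched as a small helper with a timeout, so a stuck task does not block the caller forever. This is a minimal sketch, not part of the API: `waitForStatus`, `getTask`, `intervalMs`, and `timeoutMs` are hypothetical names, and `getTask` is any async function you supply that returns the parsed task JSON. Treating "failed" as a terminal status follows the Node.js example in this guide.

```javascript
// Poll a task until it reaches the wanted status, fails, or times out.
// getTask: caller-supplied async function returning the parsed task JSON (assumption).
async function waitForStatus(getTask, wanted, { intervalMs = 3000, timeoutMs = 300000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const task = await getTask();
    if (task.status === wanted) return task;
    if (task.status === "failed") throw new Error("Task failed");
    // Give up if the next poll would land after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Timed out waiting for status "${wanted}"`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A call might look like `waitForStatus(() => fetch(taskUrl, { headers }).then((r) => r.json()), "transcriptionCompleted")`, where `taskUrl` is the Step 2 endpoint.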
Step 3: Download and edit the transcript
Download the transcript JSON from the URL returned in Step 2. It's an array of word entries:
[
  {
    "text": "Acme",
    "type": "word",
    "start_time": 0.12,
    "end_time": 0.48,
    "confidence": 0.82
  },
  {
    "text": "Corp",
    "type": "word",
    "start_time": 0.52,
    "end_time": 0.86,
    "confidence": 0.77
  }
]
Each entry has:
- text (string): the word itself
- type ("word" | "punctuation")
- start_time / end_time (number, seconds)
- emoji (string, optional)
- important (boolean, optional): flags the word for highlight rendering
- fontId (string, optional): per-word font override; see Custom Fonts
Load this array into your editing UI. Users typically change only text, but every field except confidence can be updated.
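For programmatic fixes, such as known brand-name misspellings, the edit step can be a pure function over the word array. The sketch below is an illustration only: `applyCorrections` and the `corrections` map are hypothetical names, and matching is a case-insensitive exact match on text for entries of type "word". It changes only text and leaves timing untouched.

```javascript
// Replace misrecognized words using a lookup table (keys are lowercase).
// Returns a new array; the input transcript is not mutated.
function applyCorrections(transcript, corrections) {
  return transcript.map((entry) => {
    if (entry.type !== "word") return entry; // leave punctuation alone
    const key = entry.text.toLowerCase();
    return corrections.has(key) ? { ...entry, text: corrections.get(key) } : entry;
  });
}

const sample = [
  { text: "Acme", type: "word", start_time: 0.12, end_time: 0.48, confidence: 0.82 },
  { text: "Corp", type: "word", start_time: 0.52, end_time: 0.86, confidence: 0.77 },
];
const edited = applyCorrections(sample, new Map([["acme", "ACME"]]));
console.log(edited.map((e) => e.text).join(" ")); // prints "ACME Corp"
```

Returning a copy rather than mutating in place makes it easy to diff the edited transcript against the original before saving.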
Step 4: PUT the edited transcript
Send the edited array back as the request body:
curl -X PUT "https://api.zapcap.ai/videos/YOUR_VIDEO_ID/task/YOUR_TASK_ID/transcript" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d @edited-transcript.json
Entries must stay time-ordered and non-overlapping: each end_time >= start_time, and each entry's start_time must be >= the previous entry's end_time.
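These ordering rules are easy to check client-side before sending the PUT. The following is a minimal sketch; `validateTranscript` is a hypothetical helper, not part of the API, and it checks only the two constraints stated above.

```javascript
// Return a list of human-readable problems; an empty list means the
// entries are time-ordered and non-overlapping.
function validateTranscript(entries) {
  const problems = [];
  entries.forEach((entry, i) => {
    if (entry.end_time < entry.start_time) {
      problems.push(`entry ${i} ("${entry.text}"): end_time is before start_time`);
    }
    if (i > 0 && entry.start_time < entries[i - 1].end_time) {
      problems.push(`entry ${i} ("${entry.text}"): starts before entry ${i - 1} ends`);
    }
  });
  return problems;
}

const ordered = [
  { text: "Acme", type: "word", start_time: 0.12, end_time: 0.48 },
  { text: "Corp", type: "word", start_time: 0.52, end_time: 0.86 },
];
const overlapping = [
  { text: "Acme", type: "word", start_time: 0.12, end_time: 0.6 },
  { text: "Corp", type: "word", start_time: 0.52, end_time: 0.86 },
];
console.log(validateTranscript(ordered).length); // prints 0
console.log(validateTranscript(overlapping).length); // prints 1
```

Running a check like this in the editing UI lets reviewers see which entry is at fault before the request is sent.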
Step 5: Approve and render
curl -X POST "https://api.zapcap.ai/videos/YOUR_VIDEO_ID/task/YOUR_TASK_ID/approve-transcript" \
  -H "x-api-key: YOUR_API_KEY"
No body. This promotes the task out of the transcriptionCompleted state and kicks off rendering with the transcript you just saved. Poll GET /videos/{videoId}/task/{taskId} again until status === 'completed', then download from downloadUrl.
Node.js example
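// Requires Node.js 18+ (global fetch).
const BASE = "https://api.zapcap.ai";
const authHeaders = { "x-api-key": process.env.ZAPCAP_API_KEY };
const jsonHeaders = { ...authHeaders, "Content-Type": "application/json" };

// Fetch a URL and parse the JSON response, throwing on non-2xx statuses.
async function fetchJson(url, options = {}) {
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`${options.method ?? "GET"} ${url} failed: ${res.status}`);
  }
  return res.json();
}

async function captionWithEdits(videoId, templateId, editFn) {
  const taskUrl = `${BASE}/videos/${videoId}/task`;

  // 1. Create the task with autoApprove off so rendering waits for edits
  const { taskId } = await fetchJson(taskUrl, {
    method: "POST",
    headers: jsonHeaders,
    body: JSON.stringify({ templateId, autoApprove: false, language: "en" }),
  });

  // 2. Poll until transcribed
  let task;
  while (true) {
    task = await fetchJson(`${taskUrl}/${taskId}`, { headers: authHeaders });
    if (task.status === "transcriptionCompleted") break;
    if (task.status === "failed") throw new Error("Transcription failed");
    await new Promise((r) => setTimeout(r, 3000));
  }

  // 3. Download the transcript from the signed URL
  const transcript = await fetchJson(task.transcript);

  // 4. Let the caller edit it
  const edited = await editFn(transcript);

  // 5. PUT edits; stop on failure so a stale transcript is never approved
  const putRes = await fetch(`${taskUrl}/${taskId}/transcript`, {
    method: "PUT",
    headers: jsonHeaders,
    body: JSON.stringify(edited),
  });
  if (!putRes.ok) throw new Error(`Saving transcript failed: ${putRes.status}`);

  // 6. Approve (no body, so no Content-Type header)
  const approveRes = await fetch(`${taskUrl}/${taskId}/approve-transcript`, {
    method: "POST",
    headers: authHeaders,
  });
  if (!approveRes.ok) throw new Error(`Approval failed: ${approveRes.status}`);

  return taskId;
}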
Tips
- Pre-seed corrections. If you already know which brand names and jargon are often transcribed incorrectly, pass them via the dictionary field on POST /task to improve first-pass accuracy. This reduces how much your reviewers need to edit.
- Bring your own transcript. If you already have word-level timing from another system, pass it via the transcript field on POST /task and skip transcription entirely.
- Per-word styling. Setting important: true on a word makes the template apply its highlight style. Per-word fontId also works for mixing fonts within a single caption.
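As an illustration of per-word styling, the sketch below flags selected words with important: true before the transcript is saved. `highlightWords` and the keyword set are hypothetical; only the important field comes from the entry format listed in Step 3.

```javascript
// Mark every word whose lowercase text is in `keywords` as important.
// Returns a new array; entries that do not match are passed through as-is.
function highlightWords(transcript, keywords) {
  return transcript.map((entry) =>
    entry.type === "word" && keywords.has(entry.text.toLowerCase())
      ? { ...entry, important: true }
      : entry
  );
}

const words = [
  { text: "Acme", type: "word", start_time: 0.12, end_time: 0.48 },
  { text: "Corp", type: "word", start_time: 0.52, end_time: 0.86 },
];
const styled = highlightWords(words, new Set(["acme"]));
console.log(JSON.stringify(styled.map((e) => Boolean(e.important)))); // prints [true,false]
```

The result can be sent with the same PUT request as any other transcript edit.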