✍️ Editing Transcripts Before Rendering

Auto-transcription is generally accurate, but it won't always get brand names, people's names, or industry-specific terms right. This guide walks through editing the transcript before the video is rendered, so corrections appear in the final output without any post-production cleanup.

Flow overview

POST /videos/{videoId}/task   (autoApprove: false)
   ↓
GET  /videos/{videoId}/task/{taskId}   (poll until transcriptionCompleted)
   ↓
Download transcript JSON from the `transcript` URL
   ↓
Edit the word entries in your app/UI
   ↓
PUT  /videos/{videoId}/task/{taskId}/transcript   (save edits)
   ↓
POST /videos/{videoId}/task/{taskId}/approve-transcript   (trigger render)

Common mistake

POST /approve-transcript does not accept a request body. It only triggers rendering with whatever transcript is currently stored on the task. If you skip the PUT step and send your edits to approve-transcript instead, they are silently ignored and the video renders with the original, unedited transcript.

Step 1: Create the task with `autoApprove: false`

See API Reference: Create Video Task

curl -X POST "https://api.zapcap.ai/videos/YOUR_VIDEO_ID/task" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "templateId": "YOUR_TEMPLATE_ID",
    "autoApprove": false,
    "language": "en"
  }'

Save the returned taskId.

Step 2: Poll until transcription completes

curl -X GET "https://api.zapcap.ai/videos/YOUR_VIDEO_ID/task/YOUR_TASK_ID" \
  -H "x-api-key: YOUR_API_KEY"

Wait for status to become transcriptionCompleted. A 2–5 second poll interval is fine.

The response includes a transcript field pointing to a signed URL where the generated transcript JSON can be downloaded.
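The polling described in this step could be wrapped in a small helper with a timeout, so a stuck task does not block forever. This is a minimal sketch, not part of the API: pollUntilTranscribed, fetchTask, intervalMs, and timeoutMs are hypothetical names, and the five-minute default timeout is an arbitrary assumption. fetchTask is any function you supply that resolves to the task JSON returned by the GET request above.

```javascript
// Hypothetical helper: polls until the task reaches transcriptionCompleted.
// fetchTask is caller-supplied and must resolve to the task JSON
// (for example, the parsed body of GET /videos/{videoId}/task/{taskId}).
async function pollUntilTranscribed(
  fetchTask,
  { intervalMs = 3000, timeoutMs = 300000 } = {} // defaults are assumptions
) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const task = await fetchTask();
    if (task.status === "transcriptionCompleted") return task;
    if (task.status === "failed") throw new Error("Transcription failed");
    if (Date.now() >= deadline) {
      throw new Error("Timed out waiting for transcription");
    }
    // Wait before the next poll.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The returned task object is the last response seen, so its transcript field can be used directly for the download in Step 3.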

Step 3: Download and edit the transcript

Download the transcript JSON from the URL returned in Step 2. It's an array of word entries:

[
  {
    "text": "Acme",
    "type": "word",
    "start_time": 0.12,
    "end_time": 0.48,
    "confidence": 0.82
  },
  {
    "text": "Corp",
    "type": "word",
    "start_time": 0.52,
    "end_time": 0.86,
    "confidence": 0.77
  }
]

Each entry has:

text (string) — the word itself
type ("word" | "punctuation")
start_time / end_time (number, seconds)
emoji (string, optional)
important (boolean, optional — flags the word for highlight rendering)
fontId (string, optional — per-word font override; see Custom Fonts)
confidence (number, read-only; the transcription confidence shown in the sample above)

Present this array to your editing UI. Users typically only change text, but all fields except confidence can be updated.
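For corrections that do not need a human in the loop, an edit step might look like the sketch below. It is illustrative only: applyCorrections and the corrections map are hypothetical names, and the "Core" entry is made-up sample data standing in for a mistranscribed word. Only text is changed; timing and every other field are carried over untouched.

```javascript
// Hypothetical helper: returns a corrected copy of the transcript.
// corrections is a Map from a lowercase misheard word to its replacement.
function applyCorrections(transcript, corrections) {
  return transcript.map((entry) => {
    if (entry.type !== "word") return entry; // leave punctuation untouched
    const fixed = corrections.get(entry.text.toLowerCase());
    // Only text changes; timing and other fields are copied as-is.
    return fixed === undefined ? entry : { ...entry, text: fixed };
  });
}

const transcript = [
  { text: "Acme", type: "word", start_time: 0.12, end_time: 0.48, confidence: 0.82 },
  { text: "Core", type: "word", start_time: 0.52, end_time: 0.86, confidence: 0.77 },
];
const edited = applyCorrections(transcript, new Map([["core", "Corp"]]));
console.log(edited.map((e) => e.text).join(" ")); // prints "Acme Corp"
```

A function of this shape could be passed as editFn to captionWithEdits in the Node.js example further down.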

Step 4: PUT the edited transcript

See API Reference: Update Transcript

Send the edited array back as the request body:

curl -X PUT "https://api.zapcap.ai/videos/YOUR_VIDEO_ID/task/YOUR_TASK_ID/transcript" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d @edited-transcript.json

Entries must stay time-ordered and non-overlapping: each entry's end_time must be >= its own start_time, and each entry's start_time must be >= the previous entry's end_time.
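It may be worth checking this rule locally before sending the PUT, so that an editing mistake is caught in your own app. The following is a sketch under that assumption: findTimingProblems is a hypothetical helper, not something the API provides, and the overlapping sample data is invented for illustration.

```javascript
// Hypothetical pre-flight check mirroring the ordering rule above.
// Returns a list of human-readable problems; an empty list means the
// entries are time-ordered and non-overlapping.
function findTimingProblems(entries) {
  const problems = [];
  entries.forEach((entry, i) => {
    if (entry.end_time < entry.start_time) {
      problems.push(`entry ${i} ("${entry.text}"): end_time is before start_time`);
    }
    if (i > 0 && entry.start_time < entries[i - 1].end_time) {
      problems.push(`entry ${i} ("${entry.text}"): starts before entry ${i - 1} ends`);
    }
  });
  return problems;
}

const sample = [
  { text: "Acme", type: "word", start_time: 0.12, end_time: 0.48 },
  { text: "Corp", type: "word", start_time: 0.40, end_time: 0.86 }, // overlaps "Acme"
];
console.log(findTimingProblems(sample).length); // prints 1
```

Returning a list rather than throwing on the first violation lets an editing UI show every problem at once.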

Step 5: Approve and render

See API Reference: Approve Transcript

curl -X POST "https://api.zapcap.ai/videos/YOUR_VIDEO_ID/task/YOUR_TASK_ID/approve-transcript" \
  -H "x-api-key: YOUR_API_KEY"

No body. This promotes the task out of the transcriptionCompleted state and kicks off rendering with the transcript you just saved. Poll GET /videos/{videoId}/task/{taskId} again until status === 'completed', then download from downloadUrl.

Node.js example

const BASE = "https://api.zapcap.ai";
const headers = {
  "x-api-key": process.env.ZAPCAP_API_KEY,
  "Content-Type": "application/json",
};

async function captionWithEdits(videoId, templateId, editFn) {
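  // 1. Create task (autoApprove: false pauses the task after transcription)
  const createRes = await fetch(`${BASE}/videos/${videoId}/task`, {
    method: "POST",
    headers,
    body: JSON.stringify({ templateId, autoApprove: false, language: "en" }),
  });
  if (!createRes.ok) throw new Error(`Create task failed: HTTP ${createRes.status}`);
  const { taskId } = await createRes.json();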

  // 2. Poll until transcribed
  let task;
  while (true) {
    task = await fetch(`${BASE}/videos/${videoId}/task/${taskId}`, {
      headers,
    }).then((r) => r.json());
    if (task.status === "transcriptionCompleted") break;
    if (task.status === "failed") throw new Error("Transcription failed");
    await new Promise((r) => setTimeout(r, 3000));
  }

  // 3. Download transcript
  const transcript = await fetch(task.transcript).then((r) => r.json());

  // 4. Let the caller edit it
  const edited = await editFn(transcript);

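  // 5. PUT edits; stop on failure so the unedited transcript is not rendered by mistake
  const putRes = await fetch(`${BASE}/videos/${videoId}/task/${taskId}/transcript`, {
    method: "PUT",
    headers,
    body: JSON.stringify(edited),
  });
  if (!putRes.ok) throw new Error(`Saving transcript failed: HTTP ${putRes.status}`);

  // 6. Approve (no request body, so only the API key header is sent)
  const approveRes = await fetch(`${BASE}/videos/${videoId}/task/${taskId}/approve-transcript`, {
    method: "POST",
    headers: { "x-api-key": headers["x-api-key"] },
  });
  if (!approveRes.ok) throw new Error(`Approve failed: HTTP ${approveRes.status}`);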

  return taskId;
}

Tips

Pre-seed corrections. If you already know which brand names and jargon are often transcribed incorrectly, pass them via the dictionary field on POST /task to improve first-pass accuracy. This reduces how much your reviewers need to edit.
Bring your own transcript. If you already have word-level timing from another system, pass it via the transcript field on POST /task and skip transcription entirely.
Per-word styling. Setting important: true on a word makes the template apply its highlight style. Per-word fontId also works for mixing fonts within a single caption.
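As an illustration of the per-word styling tip, the sketch below flags chosen words by setting important: true before the transcript is saved. highlightWords is a hypothetical helper and the sample entries are invented; only the important field itself comes from the entry format described in Step 3.

```javascript
// Hypothetical helper: flags selected words for highlight rendering by
// setting important: true. Matching is case-insensitive.
function highlightWords(transcript, words) {
  const targets = new Set(words.map((w) => w.toLowerCase()));
  return transcript.map((entry) =>
    entry.type === "word" && targets.has(entry.text.toLowerCase())
      ? { ...entry, important: true } // copy the entry, add the flag
      : entry
  );
}

const highlighted = highlightWords(
  [
    { text: "Try", type: "word", start_time: 0.0, end_time: 0.1 },
    { text: "Acme", type: "word", start_time: 0.12, end_time: 0.48 },
  ],
  ["acme"]
);
console.log(highlighted.map((e) => Boolean(e.important))); // prints [ false, true ]
```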
