
🤖 Auto Vietnamese Captions — AI Whisper + Claude for Video

Auto-generate SRT/VTT captions for Vietnamese videos: Whisper large-v3 transcribes, then Claude Haiku fixes diacritics, names, and slang. Standard SRT for TikTok/YouTube. Free, no signup.

Whisper VN · Claude diacritic fix · SRT/VTT/TXT · Free

Upload limits: max 500 MB and 60 minutes per video. Supported formats: MP4, MOV, WebM, AVI, MKV.
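The limits above could be checked before a file is queued. This is a minimal sketch, not the tool's actual code: the function name `validate_upload`, the constant names, and the choice to read "500 MB" as 500 × 1024 × 1024 bytes are all assumptions made for illustration.

```python
from pathlib import Path

MAX_BYTES = 500 * 1024 * 1024   # 500 MB limit (binary megabytes is an assumption)
MAX_SECONDS = 60 * 60           # 60-minute limit
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".webm", ".avi", ".mkv"}


def validate_upload(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the upload is acceptable."""
    problems = []
    # Compare extensions case-insensitively so "clip.MP4" is accepted.
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported format")
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 500 MB")
    if duration_seconds > MAX_SECONDS:
        problems.append("video longer than 60 minutes")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets an upload form show all issues in a single message.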

Whisper is self-hosted on CPU: a 10-minute video takes about 5-10 minutes. Claude then automatically fixes Vietnamese diacritics and proper names.

Why use this tool

🇻🇳
Vietnamese + Claude post-processing

Whisper transcribes → Claude Haiku fixes diacritic errors, proper names (Hà Nội, Sài Gòn), Gen-Z slang → clean SRT.

🆓
Free vs Submagic $29/mo

Unlike Submagic, which sits behind a paywall, this tool is free because it runs self-hosted Whisper on the server's CPU.

📥
3 output formats

SRT for TikTok/YouTube, VTT for HTML5 web, TXT plain for copy-paste.
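The three formats differ only slightly, so VTT and TXT can be derived from the SRT. The sketch below shows one way to do that. The function names are illustrative and this is not necessarily how the tool produces its downloads.

```python
import re

# SRT timestamps use a comma before milliseconds; WebVTT uses a period.
_TIMING = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt: str) -> str:
    """Convert SRT text to WebVTT: add the header and switch ',' to '.' in timings."""
    body = _TIMING.sub(r"\1.\2 --> \3.\4", srt.strip())
    return "WEBVTT\n\n" + body + "\n"


def srt_to_txt(srt: str) -> str:
    """Keep only the spoken text, one line per caption, for copy-paste."""
    lines = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        rows = block.splitlines()
        # rows[0] is the cue number, rows[1] the timing line, the rest is caption text.
        if len(rows) >= 3 and _TIMING.search(rows[1]):
            lines.append(" ".join(r.strip() for r in rows[2:]))
    return "\n".join(lines)
```

The cue numbers are kept in the VTT output because WebVTT accepts them as optional cue identifiers.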

How to use

  1. Upload a Vietnamese video (max 60 min).
  2. Wait for Whisper to transcribe it (5-10 min for a 10-min video).
  3. Claude auto-fixes diacritics and names.
  4. Download the SRT, VTT, or TXT file.

How AI Vietnamese captions work

The tool uses Whisper large-v3, OpenAI's open-source multilingual speech recognition model, self-hosted on a home server. For Vietnamese, accuracy is roughly 85-90% on clear speech and drops to about 70% with heavy regional accents or loud background music.
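Whisper's output has to be turned into SRT before the correction step. The sketch below assumes the transcription arrives as a list of segments with `start`, `end` (in seconds) and `text`, which is the shape the open-source `whisper` Python package returns. Whether this tool uses that package is an assumption, and the function names are illustrative.

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments shaped like {'start': float, 'end': float, 'text': str} as SRT."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{to_srt_timestamp(seg['start'])} --> {to_srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{seg['text'].strip()}")
    # SRT cues are separated by one blank line.
    return "\n\n".join(blocks) + "\n"
```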

After Whisper, the tool sends the raw SRT through Claude Haiku 4.5 with the prompt: 'Fix Vietnamese tone marks, proper names (Hà Nội instead of ha noi), Gen-Z slang (chill/flex/sus), preserve timestamps'. The output is a clean, ready-to-use SRT.
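Because the prompt asks the model to preserve timestamps, a simple safeguard is to compare the timing lines before and after correction. The sketch below is an illustration only. `fix_text` is a hypothetical stand-in for the Claude call, and falling back to the raw SRT is an assumed policy, not documented behavior of the tool.

```python
import re
from typing import Callable

# Matches a whole SRT timing line such as "00:00:00,000 --> 00:00:02,500".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$", re.MULTILINE)

# Prompt text quoted from the description above.
PROMPT = (
    "Fix Vietnamese tone marks, proper names (Hà Nội instead of ha noi), "
    "Gen-Z slang (chill/flex/sus), preserve timestamps"
)


def correct_srt(raw_srt: str, fix_text: Callable[[str, str], str]) -> str:
    """Run the correction step; fall back to the raw SRT if any timing line changed.

    fix_text(prompt, srt) is a hypothetical callable wrapping the model request.
    """
    fixed = fix_text(PROMPT, raw_srt)
    if TIMING.findall(fixed) != TIMING.findall(raw_srt):
        return raw_srt  # the model altered, dropped, or added a timing line
    return fixed
```

Passing the model call in as a function keeps the check testable without network access.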

Caveat: Whisper on CPU is slow. A 10-min video takes about 5-10 min to process, roughly 0.5-1x the video's length. A progress bar is shown while you wait.
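As a rough guide, the quoted figure of 5-10 minutes per 10-minute video can be scaled to other lengths. The sketch assumes processing time grows linearly with video length, which the page does not state. The function name is illustrative.

```python
def estimate_processing_minutes(video_minutes: float) -> tuple[float, float]:
    """Scale the quoted figure (5-10 min of processing per 10 min of video) linearly."""
    low_ratio, high_ratio = 5 / 10, 10 / 10
    return (video_minutes * low_ratio, video_minutes * high_ratio)
```

Under that assumption, a video at the 60-minute limit would take about 30-60 minutes.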

  • Self-hosted Whisper large-v3
  • Claude Haiku fixes Vietnamese diacritics and names
  • SRT/VTT/TXT export
  • Auto-delete files after 60 min
  • Free, no signup
  • Max 60 min video
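The 60-minute auto-delete in the list above could be done with a periodic cleanup job. This is a minimal sketch under assumptions: uploads sit in a single directory, and a file's age is judged by its modification time. The function and constant names are illustrative.

```python
import time
from pathlib import Path
from typing import Optional

RETENTION_SECONDS = 60 * 60  # files are deleted after 60 minutes


def delete_expired(upload_dir: Path, now: Optional[float] = None) -> list[str]:
    """Delete regular files older than the retention window and return their names."""
    now = time.time() if now is None else now
    removed = []
    for path in sorted(upload_dir.iterdir()):
        # Age is measured from the last modification time (an assumption).
        if path.is_file() and now - path.stat().st_mtime > RETENTION_SECONDS:
            path.unlink()
            removed.append(path.name)
    return removed
```

Such a function could be run every few minutes from a scheduler such as cron.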

FAQ

Is Whisper 100% accurate?

No. Accuracy is about 85-90% for clear Vietnamese speech. Always proofread for professional video work.

Does English video work too?

Yes. Whisper is multilingual and the tool auto-detects the language. English accuracy is higher, at about 92-95%.

Why the 5-10 min wait?

Self-hosted Whisper runs on CPU (no GPU). Faster processing would need a paid API, which is planned for Phase 2.