The Pipelines

How Vid works.

Two distinct pipelines — one self-serve, one managed. Both cinematic.

Pipeline 1 — Self-Serve

VID Automated Workflow

Topic → Claude script → HeyGen v3 avatar video · 1,050 cr · 20–45 min

01

Brief — Topic Input

Topic field: Anything from a product name to a full content brief — Claude fills the gaps
Style selector: Professional, Casual, Educational, or News broadcast tone
Avatar picker: Choose from the HeyGen v3 public avatar library — 50+ photorealistic presenters
Voice selector: 100+ voices by language, gender, and character — or use the avatar default
Credit check: 1,050 credits reserved upfront — refunded automatically on failure
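
The credit flow in this step can be sketched as a reserve-then-settle ledger: the 1,050 credits are held upfront, refunded on failure, and only captured on success. The class and method names here are illustrative, not Vid's actual API:

```typescript
// Sketch of the upfront credit reservation used by the self-serve pipeline.
// CreditLedger, reserve(), and settle() are hypothetical names.
const RENDER_COST = 1050;

class CreditLedger {
  constructor(public balance: number, private reserved = 0) {}

  // Reserve credits before the render starts; throws if the balance is short.
  reserve(amount: number): void {
    if (this.balance - this.reserved < amount) {
      throw new Error("insufficient credits");
    }
    this.reserved += amount;
  }

  // On completion: deduct only on success, release the hold on failure.
  settle(amount: number, success: boolean): void {
    this.reserved -= amount;
    if (success) this.balance -= amount;
  }
}
```

A failed render therefore never touches the visible balance — only the temporary hold.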
02

Script — Claude Sonnet 4.6

Research pass: Claude builds context on the topic — facts, angles, structure
Structured JSON output: Script sections with titles, body text, lower-third cues, and a duration estimate
Style blocks: Each style maps to a visual directive injected into the HeyGen prompt (e.g. GEOMETRIC BOLD, RED WIRE, SWISS PULSE)
Duration target: Word count → speech rate → seconds estimate. The full script is passed to HeyGen as creative direction, not verbatim lines
CTA generation: A clear call to action at the end of every script
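
The duration target above reduces to a pure word-count → seconds estimate. A minimal sketch, assuming a typical presenter pace of 150 words per minute (the rate is an assumption, not a confirmed Vid constant):

```typescript
// Word count → speech rate → seconds, as in the script duration target.
// 150 wpm is an assumed average speaking pace, not Vid's actual constant.
const WORDS_PER_MINUTE = 150;

function estimateSeconds(script: string): number {
  const words = script.trim().split(/\s+/).filter(Boolean).length;
  return Math.round((words / WORDS_PER_MINUTE) * 60);
}
```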
03

Render — HeyGen v3 Video Agent

POST /v3/video-agents: Prompt-based — not slide-by-slide. HeyGen's AI director interprets the full script and style block
B-roll + motion graphics: HeyGen adds stock footage, animated counters, and transitions automatically based on content
Two-step polling: Session ID → video_id assigned → video completed. Wall time 20–45 min; 540 polls × 5 s = 45 min max
Landscape 1080p: 16:9 output, production quality, downloadable MP4
Credits on completion: 1,050 credits deducted only after a successful render — no charge on failure
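
A minimal sketch of the two-step polling loop, with hypothetical getSession/getVideo callbacks standing in for the real HeyGen status endpoints (540 polls at 5 s each gives the 45-minute cap):

```typescript
// Two-step poll: first wait for a video_id on the session, then poll that
// video until completed. getSession/getVideo are stand-ins, not HeyGen's API.
type Session = { video_id?: string };
type Video = { status: "processing" | "completed" | "failed"; url?: string };

const MAX_POLLS = 540;
const INTERVAL_MS = 5000; // 540 × 5 s = 45 min max

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

async function pollRender(
  getSession: () => Promise<Session>,
  getVideo: (id: string) => Promise<Video>,
  intervalMs: number = INTERVAL_MS,
): Promise<string> {
  let videoId: string | undefined;
  for (let i = 0; i < MAX_POLLS; i++) {
    if (!videoId) {
      videoId = (await getSession()).video_id;   // step 1: session → video_id
    } else {
      const v = await getVideo(videoId);         // step 2: video_id → completed
      if (v.status === "completed" && v.url) return v.url;
      if (v.status === "failed") throw new Error("render failed");
    }
    await sleep(intervalMs);
  }
  throw new Error("timed out after 45 min");
}
```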
Pipeline 2 — Managed Service

Catalog Video Pipeline

Still image → Seedance motion → multi-shot montage · Fashion & editorial · From €499/run

01

Anchor Image — Visual Foundation

FLUX.1 Dev / Pro: Text-to-image with portrait_4_3 format, 28 inference steps
HY-WU Try-On: Virtual garment try-on for fashion catalog accuracy
IP-Adapter: Face-identity-preserving generation from a model anchor
Visual Director: Claude builds a structured 10-field prompt from rough direction
MIRE mapping: 7 layers — Environment, Light, Colour, Props, Pose, Composition, Mood Tags
Character sheet: 4-angle reference grid (front, 3/4, profile, over-shoulder) for consistency locking
Realism Phase 1: Media type profile injected at the image stage — VHS, 16mm, documentary, etc.
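
One way the Visual Director's MIRE layers could be flattened into a single labeled image prompt — the layer names come from the MIRE mapping above, while the "Label: value" assembly format is an assumption, not the actual prompt template:

```typescript
// Sketch: fold the seven MIRE layers into one labeled FLUX prompt string.
// Layer names are from the MIRE mapping; the assembly format is assumed.
interface MireLayers {
  environment: string;
  light: string;
  colour: string;
  props: string;
  pose: string;
  composition: string;
  moodTags: string[];
}

function buildImagePrompt(subject: string, m: MireLayers): string {
  return [
    subject,
    `Environment: ${m.environment}`,
    `Light: ${m.light}`,
    `Colour: ${m.colour}`,
    `Props: ${m.props}`,
    `Pose: ${m.pose}`,
    `Composition: ${m.composition}`,
    `Mood: ${m.moodTags.join(", ")}`,
  ].join(". ");
}
```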
02

Image → Video — Motion Layer

Seedance: Image-anchored video generation, consistency 65, duration 4–15 s
34 Camera Movements: 8 categories — static, horizontal, vertical, depth, circular, aerial, handheld, complex
suggestMovement(): MIRE emotional intent → closest camera movement via keyword matching
ElevenLabs dialogue: Detects quoted speech → extracts lines → generates audio → passes @Audio1 to Seedance
Async 202 pattern: Returns immediately, Seedance runs fire-and-forget, client polls /video-status every 5 s
Golden Negative Prompts: No subtitles, no music, no text overlays, no title cards, no watermarks
Realism Phase 2–3: Composition rules — close-up fills 60%+ of the frame, max 3 subjects, simple continuous motion
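
A minimal sketch of suggestMovement() as keyword matching: score each movement's keyword list against the MIRE mood text and return the best hit. The four-entry table here is illustrative, standing in for the real 34-movement catalog:

```typescript
// Sketch of suggestMovement(): MIRE mood text → best-scoring camera movement.
// The keyword table is illustrative; the real catalog has 34 movements.
const MOVEMENT_KEYWORDS: Record<string, string[]> = {
  "slow push-in":   ["intimate", "tension", "focus"],
  "orbit":          ["reveal", "dynamic", "energy"],
  "handheld drift": ["raw", "documentary", "authentic"],
  "static":         ["calm", "still", "minimal"],
};

function suggestMovement(mood: string): string {
  const words = mood.toLowerCase();
  let best = "static"; // safe default when nothing matches
  let bestScore = 0;
  for (const [movement, keywords] of Object.entries(MOVEMENT_KEYWORDS)) {
    const score = keywords.filter((k) => words.includes(k)).length;
    if (score > bestScore) {
      best = movement;
      bestScore = score;
    }
  }
  return best;
}
```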
03

Multi-Shot Montage — Cinematic Assembly

Shot list scripting: Claude scripts an ELS→LS→MLS→MS→MCU→CU→ECU→LA→HA sequence with hero shot marking (★ HERO)
Camera vocabulary: All 34 movements injected into the system prompt — Claude picks one per shot
MIRE per-shot hints: Cinematography, context, style, and audio atmosphere injected as labeled fields
Seedance parallel: Each shot rendered independently with a shared Image1 identity anchor
Flat prompt assembly: Shots joined with [Shot cut] separators into a single Seedance-ready prompt
Grid shot picker: Select any subset of 9 shot types — FLUX generates them in parallel
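
The flat prompt assembly step can be sketched like this: per-shot descriptions joined with [Shot cut] separators into one Seedance-ready string, with the hero shot marked ★ HERO. The per-shot field layout is an assumption:

```typescript
// Sketch of flat prompt assembly. The [Shot cut] separator and ★ HERO mark
// come from the list above; the per-shot "type | movement | description"
// layout is an illustrative assumption.
interface Shot {
  type: string;        // e.g. "ELS", "MS", "CU"
  movement: string;    // one of the 34 camera movements
  description: string;
  hero?: boolean;
}

function assembleFlatPrompt(shots: Shot[]): string {
  return shots
    .map((s) => `${s.type}${s.hero ? " ★ HERO" : ""} | ${s.movement} | ${s.description}`)
    .join(" [Shot cut] ");
}
```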

Sound Studio

Suno AI music generation — descriptive or custom lyrics + tags + title

MIRE auto-style: mood profile → musical genre tags automatically

buildMusicPrompt() → full sentence from environment + light + colour + mood

Waveform visualizer via Web Audio API OfflineAudioContext

A/B comparison — both Suno tracks shown side by side

Extend to match video duration — repeats from start via /sound/extend

Brand Sound library — job-level signature track stored in metadata

Attach to scene → audio_url travels with unit into video pipeline
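
The Sound Studio prompt helpers above can be sketched as a mood → genre-tag lookup (MIRE auto-style) feeding buildMusicPrompt(); the tag table and sentence wording are illustrative assumptions, not the real mappings:

```typescript
// Sketch of MIRE auto-style plus buildMusicPrompt(): mood profile → genre
// tags, then one descriptive sentence for Suno. Table and wording are assumed.
const MOOD_TO_TAGS: Record<string, string[]> = {
  melancholic: ["ambient", "downtempo"],
  energetic:   ["electronic", "driving beat"],
  elegant:     ["neo-classical", "minimal piano"],
};

function autoStyleTags(mood: string): string[] {
  return MOOD_TO_TAGS[mood] ?? ["cinematic"]; // generic fallback
}

function buildMusicPrompt(env: string, light: string, colour: string, mood: string): string {
  const tags = autoStyleTags(mood).join(", ");
  return `A ${mood} ${tags} track for a ${env} scene with ${light} light and a ${colour} palette.`;
}
```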

Skin Studio

Magnific AI (via Freepik) — creative and precision upscale modes

Engines: Illusio (photorealistic), Sharpy (high detail), Sparkle (editorial)

5 presets: Glass Skin, Natural Editorial, High Fashion, Authentic Raw, Dewy Commercial

4 sliders: Pore Texture (creativity), Definition HDR (hdr), Fidelity (resemblance), Depth Detail (fractality)

Skin character: Finish × Undertone × Pore Preset × Condition

Toggles: SSS Glow, Authenticity Seeds, 2-Pass Pipeline, Set as Approved

Before/After split viewer with download link
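
One way a Skin Studio preset could resolve to the four slider parameters. The preset names and slider-to-parameter mapping come from the list above, but the numeric values are placeholder assumptions, not real tunings:

```typescript
// Sketch: preset name → the four Magnific slider parameters. Preset names
// are from the list above; the numbers are illustrative placeholders.
interface UpscaleParams {
  creativity: number;  // Pore Texture
  hdr: number;         // Definition HDR
  resemblance: number; // Fidelity
  fractality: number;  // Depth Detail
}

const PRESETS: Record<string, UpscaleParams> = {
  "Glass Skin":        { creativity: 2, hdr: 1, resemblance: 8, fractality: 1 },
  "Natural Editorial": { creativity: 3, hdr: 3, resemblance: 7, fractality: 3 },
  "High Fashion":      { creativity: 5, hdr: 4, resemblance: 6, fractality: 4 },
  "Authentic Raw":     { creativity: 4, hdr: 2, resemblance: 9, fractality: 6 },
  "Dewy Commercial":   { creativity: 3, hdr: 5, resemblance: 7, fractality: 2 },
};

function resolvePreset(name: string): UpscaleParams {
  const p = PRESETS[name];
  if (!p) throw new Error(`unknown preset: ${name}`);
  return p;
}
```

The sliders can then still be adjusted individually after a preset is applied.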

Studio Munich Assistant