Talking Head with Studio Lighting

Professional talking-head podcast clip prompt with phoneme-level lip sync, three-point lighting, and Netflix-doc aesthetic.

Prompt verified: April 21, 2026 · Curated by Seedance prompt team · Tested with Seedance 2.0 (public model)

Why this prompt works: 50mm at f/1.8 separates the host from bookshelf bokeh — the Netflix-documentary depth cue that signals "studio production" in the first frame.

When to Use This Style

Talking-head video with lip sync removes the biggest production bottleneck in podcast and creator content: the need for a camera, a quiet room, good lighting, and a reliable take every time you want to publish. With Seedance 2.0's Image-to-Video with Audio mode, you supply one portrait photograph and one audio recording, and the model synthesizes natural head movement, micro-expressions, and phoneme-accurate lip motion matched to the speech. The Netflix-documentary aesthetic in this prompt (50mm prime, three-point lighting, dark teal background with a bokeh bookshelf) is deliberately generic enough to suit a wide range of professional contexts: interview clips, LinkedIn thought leadership, course intro videos, and podcast audiograms.

For best results, upload a front-facing portrait with a neutral expression or slight smile, soft and even lighting, and a clear view of the mouth. Avoid sunglasses, heavy shadows across the face, and extreme head angles. Your audio file should be a clean WAV or high-quality MP3 with minimal background noise, because lip sync quality depends directly on audio clarity. At 10 seconds and 16:9, each generation covers a short statement or pull-quote, the most shareable unit of podcast content on LinkedIn and Twitter. To cover a longer script, chain multiple generations with the same portrait so the clips stay visually consistent.
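As a rough illustration of chaining, the sketch below plans how a longer recording could be divided into 10-second pieces, one per generation. The function name `plan_clips` and the fixed-offset cutting strategy are assumptions made for this example; they are not part of Seedance.

```python
from typing import List, Tuple

CLIP_SECONDS = 10.0  # per-generation duration used in this showcase


def plan_clips(total_seconds: float,
               clip_seconds: float = CLIP_SECONDS) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds that cover the whole recording.

    Hypothetical helper: it only does the arithmetic and does not touch audio.
    """
    total_seconds = float(total_seconds)
    if total_seconds <= 0 or clip_seconds <= 0:
        raise ValueError("durations must be positive")
    clips = []
    start = 0.0
    while start < total_seconds:
        # The last clip may be shorter than clip_seconds.
        end = min(start + clip_seconds, total_seconds)
        clips.append((start, end))
        start = end
    return clips


if __name__ == "__main__":
    for index, (start, end) in enumerate(plan_clips(25), start=1):
        print(f"clip {index}: {start:.1f}s to {end:.1f}s")
```

Fixed offsets can land in the middle of a word, so in practice you would probably nudge each boundary to the nearest pause in the speech before exporting the segments.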

The Prompt

Model: Seedance 2.0
Duration: 10s
Aspect: 16:9
Resolution: 1080p
Mode: Image-to-Video with Audio

Professional podcast talking-head clip, medium close-up of a 30s male host in a modern studio, wearing a charcoal sweater, seated at a wooden desk with a broadcast microphone. Natural expressive delivery with subtle head movement, hand gestures entering frame occasionally, confident eye contact with camera. Three-point lighting: warm key from camera-left, soft fill from right, subtle rim light separating subject from dark teal background with soft bokeh bookshelf. Shot on 50mm prime at f/1.8, Netflix-documentary aesthetic. Phoneme-level lip sync to uploaded audio track, natural micro-expressions aligned to speech emphasis. 10 seconds, 16:9, 1080p, 24fps. Audio: input voice track, preserved.
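The prompt reads as seven sentences, each handling one concern: subject, performance, lighting, camera, lip sync, output specs, and audio. A minimal sketch of keeping those parts separate so that one can be swapped without retyping the rest is shown below. The dictionary keys and the `build_prompt` helper are assumptions for illustration only; Seedance simply receives the final string.

```python
# Each value is copied from the prompt above; the key names are hypothetical labels.
PROMPT_PARTS = {
    "subject": (
        "Professional podcast talking-head clip, medium close-up of a 30s male "
        "host in a modern studio, wearing a charcoal sweater, seated at a wooden "
        "desk with a broadcast microphone."
    ),
    "performance": (
        "Natural expressive delivery with subtle head movement, hand gestures "
        "entering frame occasionally, confident eye contact with camera."
    ),
    "lighting": (
        "Three-point lighting: warm key from camera-left, soft fill from right, "
        "subtle rim light separating subject from dark teal background with "
        "soft bokeh bookshelf."
    ),
    "camera": "Shot on 50mm prime at f/1.8, Netflix-documentary aesthetic.",
    "lip_sync": (
        "Phoneme-level lip sync to uploaded audio track, natural "
        "micro-expressions aligned to speech emphasis."
    ),
    "specs": "10 seconds, 16:9, 1080p, 24fps.",
    "audio": "Audio: input voice track, preserved.",
}


def build_prompt(parts: dict, **overrides: str) -> str:
    """Join the parts in their original order, replacing any overridden ones."""
    unknown = set(overrides) - set(parts)
    if unknown:
        raise KeyError(f"unknown prompt parts: {sorted(unknown)}")
    merged = {**parts, **overrides}
    return " ".join(merged[key] for key in parts)


if __name__ == "__main__":
    print(build_prompt(PROMPT_PARTS))
```

For example, passing a different `subject` sentence to `build_prompt` changes the host description while leaving the lighting, camera, and lip sync sentences as they were verified.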

How to Recreate This Video

1. Copy the full prompt above.
2. Open the Seedance 2.0 AI Video Generator.
3. Paste the prompt, upload your portrait and audio track, set aspect ratio to 16:9 and duration to 10s, then click Generate.

Frequently Asked Questions

Which languages does lip sync support?
Seedance 2.0 supports phoneme-level lip sync for English, Chinese, Spanish, Japanese, French, German, Korean, and Portuguese.

Do I need a portrait photo and an audio file?
Yes. Upload one portrait (front-facing, neutral expression works best) and one audio track (WAV or MP3).
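
The two answers above can be turned into a quick local check before uploading. This is a minimal sketch: the `preflight` function is hypothetical, it inspects only the file extension and a language name you supply, and it does not call any Seedance service.

```python
from pathlib import Path
from typing import List

# Taken from the FAQ answers above.
SUPPORTED_LANGUAGES = {
    "english", "chinese", "spanish", "japanese",
    "french", "german", "korean", "portuguese",
}
AUDIO_EXTENSIONS = {".wav", ".mp3"}


def preflight(audio_path: str, language: str) -> List[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    suffix = Path(audio_path).suffix.lower()
    if suffix not in AUDIO_EXTENSIONS:
        problems.append(f"audio should be WAV or MP3, got '{suffix or 'no extension'}'")
    if language.strip().lower() not in SUPPORTED_LANGUAGES:
        problems.append(f"lip sync is not listed for '{language}'")
    return problems


if __name__ == "__main__":
    print(preflight("quote_01.wav", "English"))
    print(preflight("quote_01.flac", "Italian"))
```

An extension check says nothing about audio quality, so a clean recording with minimal background noise still matters.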
