Talking Head with Studio Lighting
Professional talking-head podcast clip prompt with phoneme-level lip sync, three-point lighting, and Netflix-doc aesthetic.
Why this prompt works: 50mm at f/1.8 separates the host from bookshelf bokeh — the Netflix-documentary depth cue that signals "studio production" in the first frame.

When to Use This Style
Talking-head video with lip sync removes the biggest production bottleneck in podcast and creator content: needing a camera, a quiet room, good lighting, and a reliable take every time you want to publish. With Seedance 2.0's Image-to-Video with Audio mode, you supply one portrait photograph and one audio recording, and the model synthesizes natural head movement, micro-expressions, and phoneme-accurate lip motion matched to the speech. The Netflix-documentary aesthetic in this prompt (50mm prime, three-point lighting, dark teal background with a bokeh bookshelf) is deliberately generic, so it works across a wide range of professional contexts: interview clips, LinkedIn thought leadership, course intro videos, and podcast audiograms.
For best results, upload a front-facing portrait with a neutral expression or a slight smile, soft and even lighting, and a clear view of the mouth. Avoid sunglasses, heavy shadows across the face, and extreme head angles. Your audio file should be a clean WAV or high-quality MP3 with minimal background noise, because lip sync quality depends heavily on audio clarity. At 10 seconds and 16:9, each generation covers a short statement or pull-quote, which is the most shareable unit of podcast content on LinkedIn and Twitter. To cover a longer script, chain multiple generations that reuse the same portrait so the clips stay visually consistent.
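Chaining generations for a longer script comes down to splitting the audio into windows of at most 10 seconds each. The sketch below is one way to plan those windows using only the Python standard library. The function names `wav_duration_seconds` and `plan_segments` are hypothetical helpers written for this illustration, not part of Seedance 2.0, and the code does not call any Seedance API.

```python
import math
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the length of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def plan_segments(total_seconds: float, clip_seconds: float = 10.0) -> list[tuple[float, float]]:
    """Split a track into consecutive (start, end) windows no longer than clip_seconds."""
    if total_seconds <= 0 or clip_seconds <= 0:
        raise ValueError("durations must be positive")
    count = math.ceil(total_seconds / clip_seconds)
    return [
        (i * clip_seconds, min((i + 1) * clip_seconds, total_seconds))
        for i in range(count)
    ]


if __name__ == "__main__":
    # Assumed example: a 34-second voice track. For a real file, use
    # plan_segments(wav_duration_seconds("episode.wav")) instead.
    for index, (start, end) in enumerate(plan_segments(34.0), start=1):
        print(f"clip {index}: {start:.1f}s to {end:.1f}s")
```

A 34-second track yields four windows, the last one 4 seconds long. Fixed windows can cut through a word, so in practice you would likely nudge each boundary to the nearest pause in the speech before exporting the audio for each generation.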
The Prompt
Model: Seedance 2.0
Duration: 10s
Aspect ratio: 16:9
Resolution: 1080p
Mode: Image-to-Video with Audio
Professional podcast talking-head clip, medium close-up of a 30s male host in a modern studio, wearing a charcoal sweater, seated at a wooden desk with a broadcast microphone. Natural expressive delivery with subtle head movement, hand gestures entering frame occasionally, confident eye contact with camera. Three-point lighting: warm key from camera-left, soft fill from right, subtle rim light separating subject from dark teal background with soft bokeh bookshelf. Shot on 50mm prime at f/1.8, Netflix-documentary aesthetic. Phoneme-level lip sync to uploaded audio track, natural micro-expressions aligned to speech emphasis. 10 seconds, 16:9, 1080p, 24fps. Audio: input voice track, preserved.
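The prompt above follows a repeatable structure: subject, performance, lighting, camera, lip sync, then technical specs and the audio instruction. If you want to produce variations (a different host or wardrobe, for example) while keeping the rest fixed, a small template can help. The sketch below is an illustration only; the `TalkingHeadPrompt` class and its field names are hypothetical and are not a Seedance 2.0 format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TalkingHeadPrompt:
    """Hypothetical container for the sections of the prompt shown above."""
    subject: str
    performance: str
    lighting: str
    camera: str
    lip_sync: str
    duration_s: int = 10
    aspect_ratio: str = "16:9"
    resolution: str = "1080p"
    fps: int = 24

    def render(self) -> str:
        # Technical specs and the audio instruction close the prompt,
        # in the same order as the original.
        specs = f"{self.duration_s} seconds, {self.aspect_ratio}, {self.resolution}, {self.fps}fps."
        parts = [
            self.subject,
            self.performance,
            self.lighting,
            self.camera,
            self.lip_sync,
            specs,
            "Audio: input voice track, preserved.",
        ]
        return " ".join(parts)


prompt = TalkingHeadPrompt(
    subject=(
        "Professional podcast talking-head clip, medium close-up of a 30s male host "
        "in a modern studio, wearing a charcoal sweater, seated at a wooden desk "
        "with a broadcast microphone."
    ),
    performance=(
        "Natural expressive delivery with subtle head movement, hand gestures "
        "entering frame occasionally, confident eye contact with camera."
    ),
    lighting=(
        "Three-point lighting: warm key from camera-left, soft fill from right, "
        "subtle rim light separating subject from dark teal background with soft "
        "bokeh bookshelf."
    ),
    camera="Shot on 50mm prime at f/1.8, Netflix-documentary aesthetic.",
    lip_sync=(
        "Phoneme-level lip sync to uploaded audio track, natural micro-expressions "
        "aligned to speech emphasis."
    ),
).render()

print(prompt)
```

Keeping the lighting, camera, and lip-sync sections unchanged between variations is what should preserve the look across clips; only the subject and performance lines need to change.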
How to Recreate This Video
1. Copy the full prompt above using the "Copy Prompt" button.
2. Open the Seedance 2.0 AI Video Generator.
3. Upload your portrait photo and audio track, paste the prompt, set aspect ratio to 16:9 and duration to 10s, then click Generate.
Frequently Asked Questions
- Which languages does lip sync support?
- Seedance 2.0 supports phoneme-level lip sync for English, Chinese, Spanish, Japanese, French, German, Korean, and Portuguese.
- Do I need a portrait photo and an audio file?
- Yes. Upload one portrait (front-facing, neutral expression works best) and one audio track (WAV or MP3).
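The two answers above can be turned into a quick pre-upload check. The sketch below is a local convenience only: `preflight` is a hypothetical helper, it inspects file names rather than file contents, and the language list is copied from the answer above rather than queried from Seedance 2.0.

```python
from pathlib import Path

# Languages listed in the FAQ above.
SUPPORTED_LANGUAGES = {
    "English", "Chinese", "Spanish", "Japanese",
    "French", "German", "Korean", "Portuguese",
}
# Audio formats named in the FAQ above.
AUDIO_EXTENSIONS = {".wav", ".mp3"}


def preflight(portrait: str, audio: str, language: str) -> list[str]:
    """Return a list of problems found; an empty list means the inputs look usable."""
    problems = []
    if not portrait.strip():
        problems.append("missing portrait image")
    if Path(audio).suffix.lower() not in AUDIO_EXTENSIONS:
        problems.append("audio should be a WAV or MP3 file")
    if language.strip().title() not in SUPPORTED_LANGUAGES:
        problems.append(f"lip sync language not listed: {language}")
    return problems


if __name__ == "__main__":
    # Assumed example file names, for illustration only.
    print(preflight("host.jpg", "take1.wav", "English"))
    print(preflight("", "take1.ogg", "Italian"))
```

The first call reports no problems and the second reports three, one per failed check.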


