Abstract: Automatic Video Dubbing (AVD) generates speech from scripts that is aligned with lip motion and facial emotion. Recent research focuses on modeling multimodal context to enhance prosody ...