This paper presents a simple method for "do as I do" motion transfer: given a source video of a person dancing, we can transfer that performance to a novel (amateur) target after only a few minutes of the target subject performing standard moves. We approach this problem as video-to-video translation using pose as an intermediate representation. To transfer the motion, we extract poses from the source subject and apply the learned pose-to-appearance mapping to generate the target subject. We predict two consecutive frames for temporally coherent video results and introduce a separate pipeline for realistic face synthesis. Although our method is quite simple, it produces surprisingly compelling results (see video).
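To make the pipeline concrete, here is a minimal sketch of the inference step it describes: render the source subject's poses as stick-figure maps, then run the learned pose-to-appearance mapping frame by frame, conditioning each frame on the previous output for temporal coherence. This is not the authors' code; `Pose2AppearanceGenerator` and `transfer_motion` are hypothetical stand-ins (the real network would be a full image-to-image translation model, and pose maps would come from a pretrained pose detector such as OpenPose).

```python
# Hypothetical sketch of the pose-to-appearance inference loop, NOT the
# paper's actual implementation.
import torch
import torch.nn as nn

class Pose2AppearanceGenerator(nn.Module):
    """Placeholder image-to-image network: maps a pose stick-figure map
    (plus the previously generated frame, for temporal coherence) to an
    image of the target subject."""
    def __init__(self, in_channels: int = 6, out_channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, out_channels, 3, padding=1),
            nn.Tanh(),
        )

    def forward(self, pose_map: torch.Tensor, prev_frame: torch.Tensor) -> torch.Tensor:
        # Condition on the previous output so consecutive frames stay coherent.
        return self.net(torch.cat([pose_map, prev_frame], dim=1))

def transfer_motion(source_pose_maps: torch.Tensor,
                    generator: Pose2AppearanceGenerator) -> torch.Tensor:
    """Apply the learned pose-to-appearance mapping frame by frame.

    source_pose_maps: (T, 3, H, W) stick-figure renderings of the source
    subject's poses, assumed already normalized to the target's body scale.
    Returns a (T, 3, H, W) tensor of generated target-subject frames.
    """
    frames = []
    prev = torch.zeros_like(source_pose_maps[0:1])  # blank "previous" frame at t=0
    with torch.no_grad():
        for t in range(source_pose_maps.shape[0]):
            frame = generator(source_pose_maps[t:t + 1], prev)
            frames.append(frame)
            prev = frame  # feed each output back in as the next frame's context
    return torch.cat(frames, dim=0)
```

The separate face-synthesis pipeline mentioned above would sit on top of this loop, refining the face region of each generated frame; it is omitted here for brevity.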