Unsupervised Domain Adaptation for Human Movement

We present a novel and effective approach to video-to-video translation using generative adversarial networks. Given an input video of a person and a target video, our model generates a video of that same person mimicking the motion of the person in the target video. Our model can be used for various tasks, from "teaching" someone to dance like a skilled dancer to generating realistic videos of celebrities performing actions of our choosing. Motion transfer between people in video-to-video and image-to-image translation is a challenging task. Existing methods perform well when they only need to transfer the style of one domain to another, but they fail when they must handle geometric and structural changes between the domains, particularly for human images. Our approach tackles this obstacle by introducing a skeleton consistency loss into the training process in order to preserve the structural information of the human body during the transformation between domains. We present our results as a proof of concept, demonstrating that our approach outperforms existing methods on the task of human pose transfer.
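To illustrate the idea behind the skeleton consistency loss, the sketch below shows one way such a term could be computed, assuming a pretrained, differentiable 2D pose estimator (here called `pose_net`) that maps an image batch to joint coordinates; the estimator, its name, and the choice of an L1 distance are illustrative assumptions rather than our exact formulation.

```python
# Minimal sketch of a skeleton consistency loss (assumptions noted below).
import torch
import torch.nn as nn


class SkeletonConsistencyLoss(nn.Module):
    """Penalizes deviation between the skeleton of the generated frame
    and the skeleton of the driving (target) frame."""

    def __init__(self, pose_net: nn.Module):
        super().__init__()
        # `pose_net` is an assumed, frozen pose estimator returning (B, J, 2) joints.
        self.pose_net = pose_net
        for p in self.pose_net.parameters():
            p.requires_grad_(False)

    def forward(self, generated: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        joints_gen = self.pose_net(generated)  # (B, J, 2) joints of generated frame
        joints_tgt = self.pose_net(target)     # (B, J, 2) joints of target frame
        # L1 distance between the two skeletons (one possible choice of metric).
        return torch.mean(torch.abs(joints_gen - joints_tgt))


if __name__ == "__main__":
    # Dummy stand-in for a real pose estimator, only to show the call pattern.
    dummy_pose_net = nn.Sequential(
        nn.Flatten(),
        nn.Linear(3 * 64 * 64, 17 * 2),
        nn.Unflatten(1, (17, 2)),
    )
    loss_fn = SkeletonConsistencyLoss(dummy_pose_net)
    fake = torch.rand(4, 3, 64, 64)
    real = torch.rand(4, 3, 64, 64)
    print(loss_fn(fake, real))
```

In practice this term would be added, with a weighting coefficient, to the usual adversarial and reconstruction objectives of the generator.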