Author: Dani, Riya Jinesh
Dates: 2022-06-10; 2022-06-10; 2022-06-09
Identifiers: vt_gsexam:34693; http://hdl.handle.net/10919/110590
Abstract: Zero-shot video generation involves generating videos of concepts (action classes) that are not seen during training. Although the research community has explored conditional generation of long, high-resolution videos, zero-shot video generation remains a largely unexplored and challenging task. Most recent works can generate videos for action-object or motion-content pairs in which both the object (content) and the action (motion) are observed separately during training, yet their results often lack spatial consistency between foreground and background and do not generalize to complex scenes with multiple objects or actions. In this work, we propose Concept2Vid, which generates videos for classes that are completely unseen during training. In contrast to prior work, our model is not limited to a predefined, fixed set of class-level attributes; instead, it uses semantic information from multiple videos of the same topic to generate samples from novel classes. We evaluate our model qualitatively and quantitatively on the Kinetics400 and UCF101 datasets, demonstrating its effectiveness.
Type: ETD
Language: en
Rights: In Copyright
Subjects: Computing methodologies; Computer vision tasks; Vision for robotics; Neural networks; Information systems; Multimedia content creation
Title: Concept Vectors for Zero-Shot Video Generation
Type: Thesis