VTechWorks staff will be away for the winter holidays starting Tuesday, December 24, 2024, through Wednesday, January 1, 2025, and will not be replying to requests during this time. Thank you for your patience, and happy holidays!
 

Learning Consistent Visual Synthesis

dc.contributor.authorGao, Chenen
dc.contributor.committeechairHuang, Jia-Binen
dc.contributor.committeememberDhillon, Harpreet Singhen
dc.contributor.committeememberKopf, Johannes Peteren
dc.contributor.committeememberHuang, Berten
dc.contributor.committeememberAbbott, A. Lynnen
dc.contributor.departmentElectrical and Computer Engineeringen
dc.date.accessioned2022-08-23T08:00:27Zen
dc.date.available2022-08-23T08:00:27Zen
dc.date.issued2022-08-22en
dc.description.abstractWith the rapid development of photography, we can easily record the 3D world by taking photos and videos. In traditional images and videos, the viewer observes the scene from fixed viewpoints and cannot navigate the scene or edit the 2D observation afterward. Thus, visual content editing and synthesis become an essential task in computer vision. However, achieving high-quality visual synthesis often requires a complex and expensive multi-camera setup. This is not practical for daily use because most people only have one cellphone camera. But a single camera, on the contrary, could not provide enough multi-view constraints to synthesize consistent visual content. Therefore, in this thesis, I address this challenging single-camera visual synthesis problem by leveraging different regularizations. I study three consistent synthesis problems: time-consistent synthesis, view-consistent synthesis, and view-time-consistent synthesis. I show how we can take cellphone-captured monocular images and videos as input to model the scene and consistently synthesize new content for an immersive viewing experience.en
dc.description.abstractgeneralWith the rapid development of photography, we can easily record the 3D world by taking photos and videos. More recently, we have incredible cameras on cell phones, which enable us to take pro-level photos and videos. Those powerful cellphones even have advanced computational photography features build-in. However, these features focus on faithfully recording the world during capturing. We can only watch the photo and video as it is, but not navigate the scene, edit the 2D observation, or synthesize content afterward. Thus, visual content editing and synthesis become an essential task in computer vision. We know that achieving high-quality visual synthesis often requires a complex and expensive multi-camera setup. This is not practical for daily use because most people only have one cellphone camera. But a single camera, on the contrary, is not enough to synthesize consistent visual content. Therefore, in this thesis, I address this challenging single-camera visual synthesis problem by leveraging different regularizations. I study three consistent synthesis problems: time-consistent synthesis, view-consistent synthesis, and view-time-consistent synthesis. I show how we can take cellphone-captured monocular images and videos as input to model the scene and consistently synthesize new content for an immersive viewing experience.en
dc.description.degreeDoctor of Philosophyen
dc.format.mediumETDen
dc.identifier.othervt_gsexam:35023en
dc.identifier.urihttp://hdl.handle.net/10919/111588en
dc.language.isoenen
dc.publisherVirginia Techen
dc.rightsCreative Commons Attribution 4.0 Internationalen
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/en
dc.subjectComputer visionen
dc.subjectComputational photographyen
dc.subjectView synthesisen
dc.subjectTemporal consistencyen
dc.titleLearning Consistent Visual Synthesisen
dc.typeDissertationen
thesis.degree.disciplineComputer Engineeringen
thesis.degree.grantorVirginia Polytechnic Institute and State Universityen
thesis.degree.leveldoctoralen
thesis.degree.nameDoctor of Philosophyen

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Gao_C_D_2022.pdf
Size:
43.88 MB
Format:
Adobe Portable Document Format