Data-Efficient Learning in Image Synthesis and Instance Segmentation

dc.contributor.author: Robb, Esther Anne
dc.contributor.committeechair: Huang, Jia-Bin
dc.contributor.committeemember: Eldardiry, Hoda
dc.contributor.committeemember: Jia, Ruoxi
dc.contributor.department: Electrical and Computer Engineering
dc.date.accessioned: 2021-08-19T08:00:13Z
dc.date.available: 2021-08-19T08:00:13Z
dc.date.issued: 2021-08-18
dc.description.abstract: Modern deep learning methods have achieved remarkable performance on a variety of computer vision tasks, but frequently require large, well-balanced training datasets to achieve high-quality results. Data-efficient performance is critical for downstream tasks such as automated driving or facial recognition. We propose two methods of data-efficient learning for the tasks of image synthesis and instance segmentation. We first propose a method for high-quality and diverse image generation by finetuning on only 5-100 images. Our method factors a pretrained model into a small but highly expressive weight space for finetuning, which discourages overfitting to a small training set. We validate our method in a challenging few-shot setting of 5-100 images in the target domain. We show that our method has significant visual quality gains compared with existing GAN adaptation methods. Next, we introduce a simple adaptive instance segmentation loss which achieves state-of-the-art results on the LVIS dataset. We demonstrate that rare categories are heavily suppressed by correct background predictions, which reduce the probability of all foreground categories with equal weight. Due to the relative infrequency of rare categories, this leads to an imbalance that biases the model towards predicting more frequent categories. Based on this insight, we develop DropLoss, a novel adaptive loss to compensate for this imbalance without a trade-off between rare and frequent categories.
dc.description.abstractgeneral: Many of the impressive results seen in modern computer vision rely on learning patterns from huge datasets of images, but these datasets may be expensive or difficult to collect. Many applications of computer vision need to learn from a very small number of examples, such as learning to recognize an unusual traffic event and behave safely in a self-driving car. In this thesis we propose two methods of learning from only a few examples. Our first method generates novel, high-quality and diverse images using a model fine-tuned on only 5-100 images. We start with an image generation model that was trained on a much larger image set (70K images) and adapt it to a smaller image set (5-100 images). We selectively train only part of the network to encourage diversity and prevent memorization. Our second method focuses on the instance segmentation setting, where the model predicts (1) what objects occur in an image and (2) their exact outline in the image. This setting commonly suffers from long-tail distributions, where some of the known objects occur frequently (e.g. "human" may occur 1000+ times) but most only occur a few times (e.g. "cake" or "parrot" may only occur 10 times). We observed that the "background" label has a disproportionate effect of suppressing the rare object labels. We use this observation to develop a method that balances the suppression from background classes during training.
dc.description.degree: Master of Science
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:32073
dc.identifier.uri: http://hdl.handle.net/10919/104676
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Computer vision
dc.subject: data-efficient learning
dc.title: Data-Efficient Learning in Image Synthesis and Instance Segmentation
dc.type: Thesis
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: masters
thesis.degree.name: Master of Science
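
The first method in the abstract restricts fine-tuning of a pretrained generator to a small but expressive subset of its weight space. The following is a minimal PyTorch sketch of that general idea, assuming an SVD factorization of a single linear layer in which only the singular values are trained; the class name, layer choice, and optimizer setup are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SVDAdaptedLinear(nn.Module):
    """Factor a pretrained weight W = U diag(s) V^T and fine-tune only s.

    U and V are frozen buffers, so the trainable parameter count is tiny,
    which discourages memorizing a 5-100 image target set.
    """

    def __init__(self, pretrained: nn.Linear):
        super().__init__()
        # Thin SVD of the pretrained weight matrix (out_features x in_features).
        U, s, Vh = torch.linalg.svd(pretrained.weight.data, full_matrices=False)
        self.register_buffer("U", U)                    # frozen factor
        self.register_buffer("Vh", Vh)                  # frozen factor
        self.s = nn.Parameter(s.clone())                # only trainable tensor
        bias = pretrained.bias
        self.register_buffer(
            "bias", bias.data.clone() if bias is not None else None
        )

    def forward(self, x):
        # Reconstruct the weight from frozen factors and adapted singular values.
        weight = self.U @ torch.diag(self.s) @ self.Vh
        return F.linear(x, weight, self.bias)

# Example usage: wrap a (stand-in) pretrained layer, then fine-tune only s.
layer = nn.Linear(512, 512)
adapted = SVDAdaptedLinear(layer)
optimizer = torch.optim.Adam([adapted.s], lr=1e-3)
```

Training only the singular values keeps the adapted weights close to the pretrained solution while still allowing per-direction rescaling, which is one way to realize the "small but highly expressive weight space" described above.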

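The second method builds on the observation that correct background predictions suppress rare foreground categories. Below is a minimal sketch of that idea, assuming a per-class binary classification head, a precomputed rare-category mask, and a dedicated background column; the function name and normalization are assumptions and this is not the exact DropLoss formulation from the thesis.

```python
import torch
import torch.nn.functional as F

def adaptive_background_loss(logits, targets, is_rare, background_idx):
    """Classification loss that drops rare-category suppression on background rows.

    logits:         (N, C) per-class scores; column `background_idx` is background
    targets:        (N,) ground-truth class indices (int64)
    is_rare:        (C,) boolean mask marking rare categories
    background_idx: integer index of the background class
    """
    num_classes = logits.shape[1]
    onehot = F.one_hot(targets, num_classes=num_classes).float()
    per_class = F.binary_cross_entropy_with_logits(logits, onehot, reduction="none")

    # For proposals labeled background, zero out the loss terms that would push
    # down rare-category logits; keep the background term itself intact.
    weight = torch.ones_like(per_class)
    bg_rows = targets == background_idx
    weight[bg_rows] = (~is_rare).float()
    weight[bg_rows, background_idx] = 1.0

    return (per_class * weight).sum() / max(logits.shape[0], 1)
```

Because background proposals vastly outnumber rare-category proposals, removing their contribution to the rare-category logits rebalances training without down-weighting the frequent categories.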
Files

Original bundle
Name: Robb_EA_T_2021.pdf
Size: 17.98 MB
Format: Adobe Portable Document Format
