Learning without Expert Labels for Multimodal Data

dc.contributor.author: Maruf, Md Abdullah Al
dc.contributor.committeechair: Karpatne, Anuj
dc.contributor.committeemember: Huang, Lifu
dc.contributor.committeemember: Chao, Wei-Lun
dc.contributor.committeemember: Lourentzou, Ismini
dc.contributor.committeemember: Murali, T. M.
dc.contributor.department: Computer Science & Applications
dc.date.accessioned: 2025-01-10T09:01:33Z
dc.date.available: 2025-01-10T09:01:33Z
dc.date.issued: 2025-01-09
dc.description.abstract: While advancements in deep learning have been largely possible due to the availability of large-scale labeled datasets, obtaining labeled datasets at the required granularity is challenging in many real-world applications, especially in scientific domains, due to the costly and labor-intensive nature of generating annotations. Hence, there is a need to develop new paradigms for learning that do not rely on expert-labeled data and can work even with indirect supervision. Approaches for learning with indirect supervision include unsupervised learning, self-supervised learning, weakly supervised learning, few-shot learning, and knowledge distillation. This thesis addresses these opportunities in the context of multi-modal data through three main contributions. First, this thesis proposes a novel Distance-aware Negative Sampling method for self-supervised Graph Representation Learning (GRL) that learns node representations directly from the graph structure by maximizing separation between distant nodes and maximizing cohesion among nearby nodes. Second, this thesis introduces effective modifications to weakly supervised semantic segmentation (WS3) models, such as stochastic aggregation of saliency maps, that improve the learning of pseudo-ground truths from class-level coarse-grained labels and address the limitations of class activation maps. Finally, this thesis evaluates whether pre-trained Vision-Language Models (VLMs) contain the necessary scientific knowledge to identify and reason about biological traits from scientific images. The zero-shot performance of 12 large VLMs is evaluated on a novel VLM4Bio dataset, and the effects of prompting and reasoning hallucinations are explored.
dc.description.abstractgeneral: While advancements in machine learning (ML), such as deep learning, have been largely possible due to the availability of large-scale labeled datasets, obtaining high-quality and high-resolution labels is challenging in many real-world applications due to the costly and labor-intensive nature of generating annotations. This thesis explores new ways of training ML models without relying heavily on expert-labeled data, using indirect supervision. First, it introduces a novel way of using the structure of graphs for learning representations of graph-based data. Second, it analyzes the effect of weak supervision using coarse labels for image-based data. Third, it evaluates whether current ML models can recognize and reason about scientific images on their own, aiming to make learning more efficient and less dependent on exhaustive labeling.
dc.description.degree: Doctor of Philosophy
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:42306
dc.identifier.uri: https://hdl.handle.net/10919/124087
dc.language.iso: en
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Deep Learning
dc.subject: Knowledge-Guided Machine Learning
dc.subject: Weak Supervision
dc.subject: Self-Supervision
dc.subject: Vision-Language Models
dc.title: Learning without Expert Labels for Multimodal Data
dc.type: Dissertation
thesis.degree.discipline: Computer Science & Applications
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: doctoral
thesis.degree.name: Doctor of Philosophy

Files

Original bundle
Name: Maruf_M_D_2025.pdf
Size: 23.34 MB
Format: Adobe Portable Document Format