Browsing by Author "Wahed, Muntasir"
- MArBLE: Hierarchical Multi-Armed Bandits for Human-in-the-Loop Set Expansion
  Wahed, Muntasir; Gruhl, Daniel; Lourentzou, Ismini (ACM, 2023-10-21)
  The modern-day research community has an embarrassment of riches regarding pre-trained AI models. Even for a simple task such as lexicon set expansion, where an AI model suggests new entities to add to a predefined seed set of entities, thousands of models are available. However, deciding which model to use for a given set expansion task is non-trivial. In hindsight, some models can be ‘off topic’ for specific set expansion tasks, while others might work well initially but quickly exhaust what they have to offer. Additionally, certain models may require more careful priming in the form of samples or feedback before being fine-tuned to the task at hand. In this work, we frame this model selection as a sequential non-stationary problem, where there exist a large number of diverse pre-trained models that may or may not fit a task at hand, and an expert is shown one suggestion at a time to include in the set or not, i.e., accept or reject the suggestion. The goal is to expand the list with the most entities as quickly as possible. We introduce MArBLE, a hierarchical multi-armed bandit method for this task, and two strategies designed to address cold-start problems. Experimental results on three set expansion tasks demonstrate MArBLE’s effectiveness compared to baselines.
- A Task-Driven Privacy-Preserving Data-Sharing Framework for the Industrial Internet
  Shojaee, Parshin; Zeng, Yingyan; Wahed, Muntasir; Seth, Avi; Jin, Ran; Lourentzou, Ismini (2023-01)
  Industrial Internet provides a collaborative computational platform for participating enterprises, allowing the collection of big data for machine learning tasks. Despite the promise of training and deployment acceleration, and the potential to optimize decision-making processes through data-sharing, the adoption of such technologies is impacted by the increasing concerns about information privacy. As enterprises prefer to keep data private, this limits interoperability. While prior work has largely explored privacy-preserving mechanisms, the proposed methods naively average or randomly sample data shared from all participants instead of selecting the most well-suited subsets for a particular downstream learning task. Motivated by the lack of effective data-sharing mechanisms for heterogeneous machine learning tasks in Industrial Internet, we propose PriED, a task-driven data-sharing framework that selectively fuses shared data and local data from participants to improve supervised learning performance. PriED utilizes privacy-preserving data distillation to facilitate data exchange, and dynamic data selection to optimize downstream machine learning tasks. We demonstrate performance improvements on a real semiconductor manufacturing case study.
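
The MArBLE abstract above frames model selection as a sequential, non-stationary bandit problem in which each pre-trained model is an arm and the expert's accept/reject decision is the reward. The following is a minimal sketch of that framing only, not the method from the paper: it is a flat (non-hierarchical) bandit, and the discounted-UCB rule, the class name `DiscountedUCB`, and the parameters `gamma` and `c` are all assumptions chosen for illustration. Discounting is one common way to handle arms whose usefulness fades, such as a model that runs out of good suggestions.

```python
import math


class DiscountedUCB:
    """Flat discounted-UCB bandit over candidate models (illustrative only).

    This is NOT MArBLE: there is no hierarchy and no cold-start strategy.
    """

    def __init__(self, n_arms, gamma=0.95, c=1.0):
        self.gamma = gamma              # discount: older feedback counts less
        self.c = c                      # exploration strength (assumed value)
        self.counts = [0.0] * n_arms    # discounted number of pulls per arm
        self.rewards = [0.0] * n_arms   # discounted number of accepts per arm

    def select(self):
        """Return the index of the model to query for the next suggestion."""
        # Try every arm once before trusting the confidence bounds.
        for arm, n in enumerate(self.counts):
            if n == 0.0:
                return arm
        # After the first update the discounted total is at least 1,
        # so the logarithm below is non-negative; max() is a safety guard.
        total = max(sum(self.counts), 1.0)

        def score(arm):
            mean = self.rewards[arm] / self.counts[arm]
            bonus = self.c * math.sqrt(math.log(total) / self.counts[arm])
            return mean + bonus

        return max(range(len(self.counts)), key=score)

    def update(self, arm, accepted):
        """Record the expert's accept (True) or reject (False) decision."""
        # Decay all statistics, then add the new observation for this arm.
        self.counts = [self.gamma * n for n in self.counts]
        self.rewards = [self.gamma * r for r in self.rewards]
        self.counts[arm] += 1.0
        self.rewards[arm] += 1.0 if accepted else 0.0


# Hypothetical usage: three models and a stand-in expert who accepts
# suggestions only from model 2.
bandit = DiscountedUCB(n_arms=3)
for _ in range(30):
    arm = bandit.select()
    bandit.update(arm, accepted=(arm == 2))
```

Because old feedback is decayed on every step, an arm that has been ignored for a while regains a large exploration bonus and is retried, which is the behaviour one would want when a previously weak model might become useful or a strong one dries up.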