High-Quality Dataset-Sharing and Trade Based on A Performance-Oriented Directed Graph Neural Network
Abstract
The advancement of Artificial Intelligence (AI) models relies heavily on large, high-quality datasets. In advanced manufacturing, however, collecting such data is time-consuming and labor-intensive for a single enterprise. It is therefore important to establish a context-aware, privacy-preserving data-sharing system that lets trusted stakeholders exchange small but high-quality datasets. Existing data-sharing approaches have explored privacy-preserving data distillation and have focused on valuating individual samples with respect to a specific AI model, which limits their flexibility across data modalities, AI tasks, and dataset ownership. In this work, we propose a performance-oriented representation learning (PORL) framework within a Directed Graph Neural Network (DiGNN). PORL distills raw datasets into privacy-preserving proxy datasets for sharing and learns a compact metadata representation locally for each stakeholder. The metadata are then used in the DiGNN to forecast AI model performance and to guide sharing via graph-level supervised learning. The effectiveness of PORL-DiGNN is validated in two case studies: (1) data sharing in a semiconductor manufacturing network between similar processes to create similar quality-defect models; and (2) data sharing in the design and manufacturing network of Microbial Fuel Cell anodes between the upstream (design) and downstream (Additive Manufacturing) stages to create distinct but related AI models.