High-Quality Dataset-Sharing and Trade Based on A Performance-Oriented Directed Graph Neural Network

TR Number

Date

2025

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

The advancement of Artificial Intelligence (AI) models heavily relies on large high-quality datasets. However, in advanced manufacturing, collecting such data is time-consuming and labor-intensive for a single enterprise. Hence, it is important to establish a context-aware and privacy-preserving data sharing system to share small-but-high-quality datasets between trusted stakeholders. Existing data sharing approaches have explored privacy-preserving data distillation methods and focused on valuating individual samples tied to a specific AI model, limiting their flexibility across data modalities, AI tasks, and dataset ownership. In this work, we propose a performance-oriented representation learning (PORL) framework in a Directed Graph Neural Network (DiGNN). PORL distills raw datasets into privacy-preserving proxy datasets for sharing and learns compact meta data representations for each stakeholder locally. The meta data will then be used in DiGNN to forecast the AI model performance and guide the sharing via graph-level supervised learning. The effectiveness of the PORL-DiGNN is validated by two case studies: data sharing in the semiconducting manufacturing network between similar processes to create similar quality defect models; and data sharing in the design and manufacturing network of Microbial Fuel Cell anodes between upstream (design) and downstream (Additive Manufacturing) stages to create distinct but related AI models.

Description

Keywords

Citation