Totoro: A Scalable Federated Learning Engine for the Edge

dc.contributor.author: Ching, Cheng-Wei
dc.contributor.author: Chen, Xin
dc.contributor.author: Kim, Taehwan
dc.contributor.author: Ji, Bo
dc.contributor.author: Wang, Qingyang
dc.contributor.author: Da Silva, Dilma
dc.contributor.author: Hu, Liting
dc.date.accessioned: 2024-05-02T12:35:03Z
dc.date.available: 2024-05-02T12:35:03Z
dc.date.issued: 2024-04-22
dc.date.updated: 2024-05-01T07:49:14Z
dc.description.abstract: Federated Learning (FL) is an emerging distributed machine learning (ML) technique that enables in-situ model training and inference on decentralized edge devices. We propose Totoro, a novel scalable FL engine, that enables massive FL applications to run simultaneously on edge networks. The key insight is to explore a distributed hash table (DHT)-based peer-to-peer (P2P) model to re-architect the centralized FL system design into a fully decentralized one. In contrast to previous studies where many FL applications shared one centralized parameter server, Totoro assigns a dedicated parameter server to each individual application. Any edge node can act as any application’s coordinator, aggregator, client selector, worker (participant device), or any combination of the above, thereby radically improving scalability and adaptivity. Totoro introduces three innovations to realize its design: a locality-aware P2P multi-ring structure, a publish/subscribe-based forest abstraction, and a bandit-based exploitation-exploration path planning model. Real-world experiments on 500 Amazon EC2 servers show that Totoro scales gracefully with the number of FL applications and N edge nodes, speeds up the total training time by 1.2×–14.0×, achieves O(log N) hops for model dissemination and gradient aggregation with millions of nodes, and efficiently adapts to the practical edge networks and churns.
dc.description.version: Published version
dc.format.mimetype: application/pdf
dc.identifier.doi: https://doi.org/10.1145/3627703.3629575
dc.identifier.uri: https://hdl.handle.net/10919/118732
dc.language.iso: en
dc.publisher: ACM
dc.rights: Creative Commons Attribution-ShareAlike 4.0 International
dc.rights.holder: The author(s)
dc.rights.uri: http://creativecommons.org/licenses/by-sa/4.0/
dc.title: Totoro: A Scalable Federated Learning Engine for the Edge
dc.type: Article - Refereed
dc.type.dcmitype: Text

Files

Original bundle
Name: 3627703.3629575.pdf
Size: 2.64 MB
Format: Adobe Portable Document Format
Description: Published version
License bundle
Name: license.txt
Size: 1.5 KB
Format: Item-specific license agreed upon to submission
Description: