(Private) Kernelized Bandits with Distributed Biased Feedback

dc.contributor.author: Li, Fengjiao [en]
dc.contributor.author: Zhou, Xingyu [en]
dc.contributor.author: Ji, Bo [en]
dc.date.accessioned: 2023-04-04T17:37:50Z [en]
dc.date.available: 2023-04-04T17:37:50Z [en]
dc.date.issued: 2023-02-28 [en]
dc.date.updated: 2023-04-01T07:53:14Z [en]
dc.description.abstract: In this paper, we study kernelized bandits with distributed biased feedback. This problem is motivated by several real-world applications (such as dynamic pricing, cellular network configuration, and policy making), where users from a large population contribute to the reward of the action chosen by a central entity, but it is difficult to collect feedback from all users. Instead, only biased feedback (due to user heterogeneity) from a subset of users may be available. In addition to such partial biased feedback, we are also faced with two practical challenges due to communication cost and computation complexity. To tackle these challenges, we carefully design a new distributed phase-then-batch-based elimination (DPBE) algorithm, which samples users in phases for collecting feedback to reduce the bias and employs maximum variance reduction to select actions in batches within each phase. By properly choosing the phase length, the batch size, and the confidence width used for eliminating suboptimal actions, we show that DPBE achieves a sublinear regret of $\tilde{O}(T^{1-\alpha/2}+\sqrt{\gamma_T T})$, where $\alpha\in (0,1)$ is the user-sampling parameter one can tune. Moreover, DPBE can significantly reduce both communication cost and computation complexity in distributed kernelized bandits, compared to some variants of the state-of-the-art algorithms (originally developed for standard kernelized bandits). Furthermore, by incorporating various differential privacy models (including the central, local, and shuffle models), we generalize DPBE to provide privacy guarantees for users participating in the distributed learning process. Finally, we conduct extensive simulations to validate our theoretical results and evaluate the empirical performance. [en]
dc.description.version: Published version [en]
dc.format.mimetype: application/pdf [en]
dc.identifier.doi: https://doi.org/10.1145/3579318 [en]
dc.identifier.uri: http://hdl.handle.net/10919/114254 [en]
dc.language.iso: en [en]
dc.publisher: ACM [en]
dc.rights: Creative Commons Attribution 4.0 International [en]
dc.rights.holder: The author(s) [en]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [en]
dc.title: (Private) Kernelized Bandits with Distributed Biased Feedback [en]
dc.type: Article - Refereed [en]
dc.type.dcmitype: Text [en]

Files

Original bundle
Name: 3579318.pdf
Size: 2.21 MB
Format: Adobe Portable Document Format
Description: Published version