Gossip learning with linear models on fully distributed data

Róbert Ormándi, István Hegedűs, Márk Jelasity

Research output: Contribution to journal › Article

34 Citations (Scopus)


Machine learning over fully distributed data poses an important problem in peer-to-peer applications. In this model, each network node holds exactly one data record, and raw data cannot be moved because of privacy considerations; user profiles, ratings, histories, and sensor readings are examples of this setting. The problem is difficult because no node can learn a meaningful local model on its own, the system model offers almost no reliability guarantees, and the communication cost must nevertheless be kept low. Here, we propose gossip learning, a generic approach in which multiple models take random walks over the network in parallel, improve themselves by applying an online learning algorithm, and are combined via ensemble learning methods. We present an instantiation of this approach for classification with linear models. Our main contribution is an ensemble learning method that, through the continuous combination of the models in the network, implements a virtual weighted voting mechanism over an exponential number of models at practically no extra cost compared with independent random walks. We prove the convergence of the method theoretically and perform extensive experiments on benchmark data sets. Our experimental analysis demonstrates the performance and robustness of the proposed approach.
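The scheme summarized in the abstract can be illustrated with a toy, single-process simulation. This is only an illustrative sketch, not the authors' implementation: it assumes a fully connected overlay, a Pegasos-style hinge-loss SGD update as the online learner, and simple model averaging as the combination step; all names, constants, and the synthetic data are invented for the example.

```python
import random

random.seed(0)

DIM = 2          # feature dimension of the toy problem
NUM_NODES = 40   # one labeled record per node (fully distributed data)
LAMBDA = 0.01    # regularization constant (Pegasos-style, assumed)

# Synthetic fully distributed data set: each node holds a single record.
def make_record():
    x = [random.gauss(0, 1), random.gauss(0, 1)]
    y = 1 if x[0] + x[1] > 0 else -1  # linearly separable toy concept
    return x, y

records = [make_record() for _ in range(NUM_NODES)]

def sgd_update(w, t, x, y):
    """One hinge-loss SGD step (Pegasos-style learning rate 1/(lambda*t))."""
    eta = 1.0 / (LAMBDA * t)
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    w = [(1 - eta * LAMBDA) * wi for wi in w]  # shrink (regularization)
    if margin < 1:                              # misclassified or low margin
        w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def merge(w1, w2):
    """Combine two models by averaging -- the ensemble combination step."""
    return [(a + b) / 2 for a, b in zip(w1, w2)]

# Every node stores a current model; models spread via random walks.
models = [[0.0] * DIM for _ in range(NUM_NODES)]
steps = [1] * NUM_NODES

for cycle in range(200):
    for node in range(NUM_NODES):
        # Random walk step: send this node's model to a random peer.
        peer = random.randrange(NUM_NODES)
        # The peer merges the received model with its own stored model,
        # then applies one online update using its single local record.
        combined = merge(models[node], models[peer])
        steps[peer] += 1
        x, y = records[peer]
        models[peer] = sgd_update(combined, steps[peer], x, y)

# Evaluate the model held at an arbitrary node over all records.
w = models[0]
correct = sum(1 for x, y in records
              if (sum(wi * xi for wi, xi in zip(w, x)) > 0) == (y > 0))
print(f"accuracy of node 0's model: {correct / NUM_NODES:.2f}")
```

Because every merge averages two walks' models, the model stored at any node implicitly aggregates an exponentially growing number of walk histories, which is the intuition behind the "virtual weighted voting" claim in the abstract.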

Original language: English
Pages (from-to): 556-571
Number of pages: 16
Journal: Concurrency and Computation: Practice and Experience
Issue number: 4
Publication status: Published - Jan 1 2013



Keywords

  • P2P
  • bagging
  • gossip
  • online learning
  • random walk
  • stochastic gradient descent

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Computer Science Applications
  • Computer Networks and Communications
  • Computational Theory and Mathematics
