Privacy and security are among the highest priorities in data mining approaches over data collected from mobile devices. Fully distributed machine learning is a promising direction in this context. However, it is a hard problem to design protocols that are efficient yet provide sufficient levels of privacy and security. In fully distributed environments, secure multiparty computation (MPC) is often applied to solve these problems. However, in our dynamic and unreliable application domain, known MPC algorithms are not scalable or not robust enough. We propose a light-weight protocol to quickly and securely compute the sum query over a subset of participants assuming a semihonest adversary. During the computation the participants learn no individual values. We apply this protocol to efficiently calculate the sum of gradients as part of a fully distributed minibatch stochastic gradient descent algorithm. The protocol achieves scalability and robustness by exploiting the fact that in this application domain a "quick and dirty" sum computation is acceptable. We utilize the Paillier homomorphic cryptosystem as part of our solution combined with extreme lossy gradient compression to make the cost of the cryptographic algorithms affordable. We demonstrate both theoretically and experimentally, based on churn statistics from a real smartphone trace, that the protocol is indeed practically viable.
ASJC Scopus subject areas
- Information Systems
- Computer Networks and Communications