Peer-assisted distributed architecture for training machine learning models
Abstract:
Techniques for distributing the training of machine learning models across a plurality of computing devices are presented. An example method includes receiving, from a computing device in a distributed computing environment, a request for a set of outstanding jobs for training part of a machine learning model. A system transmits, to the computing device, information identifying the set of outstanding jobs. The system then receives, from the computing device, a job selected from the set of outstanding jobs for execution on the computing device. The system may transmit, to the computing device, a chunk of training data and one or more parameters associated with the selected job, and may take one or more actions with respect to that chunk of data based on a response from the computing device.
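The exchange described in the abstract can be illustrated with a minimal in-process sketch. Everything named here is hypothetical and not taken from the patent: the `Job` and `Coordinator` classes, the `worker_step` function, the parameter names `w` and `lr`, and the choice of a single gradient step on a squared-error objective as the "training" work. The abstract does not say which actions the system takes on a response, so this sketch assumes two: record the result on success, or return the job to the outstanding set on failure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Job:
    job_id: str
    chunk: List[float]        # chunk of training data tied to this job
    params: Dict[str, float]  # parameters sent together with the chunk


@dataclass
class Coordinator:
    """Hypothetical stand-in for the 'system' in the abstract."""
    outstanding: Dict[str, Job] = field(default_factory=dict)
    in_progress: Dict[str, Job] = field(default_factory=dict)
    completed: Dict[str, float] = field(default_factory=dict)

    def list_outstanding(self) -> List[str]:
        # Steps 1-2: answer a device's request with the outstanding job ids.
        return sorted(self.outstanding)

    def assign(self, job_id: str) -> Job:
        # Steps 3-4: the device reports its selection; hand over chunk and params.
        job = self.outstanding.pop(job_id)
        self.in_progress[job_id] = job
        return job

    def handle_response(self, job_id: str, ok: bool,
                        result: Optional[float] = None) -> None:
        # Step 5 (assumed actions): store the result, or requeue the job.
        job = self.in_progress.pop(job_id)
        if ok and result is not None:
            self.completed[job_id] = result
        else:
            self.outstanding[job_id] = job


def worker_step(coord: Coordinator) -> Optional[str]:
    """One round trip from the computing device's side."""
    job_ids = coord.list_outstanding()
    if not job_ids:
        return None
    chosen = job_ids[0]  # selection policy is an assumption: take the first id
    job = coord.assign(chosen)
    w, lr = job.params["w"], job.params["lr"]
    # Toy workload: one gradient step on the mean of (w - x)^2 over the chunk.
    grad = sum(2 * (w - x) for x in job.chunk) / len(job.chunk)
    coord.handle_response(chosen, ok=True, result=w - lr * grad)
    return chosen


coord = Coordinator(outstanding={
    "job-a": Job("job-a", [1.0, 3.0], {"w": 0.0, "lr": 0.5}),
    "job-b": Job("job-b", [2.0, 4.0], {"w": 0.0, "lr": 0.5}),
})
worker_step(coord)
```

In a real deployment each method call would be a network message between the device and the system; the sketch keeps them as direct calls so that the order of the five steps stays visible.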