Distributed Matrix Multiplication with MDS Array BP-XOR Codes for Scaling Clusters

Date
2019
Publisher
IEEE
Abstract
This study presents a novel coded computation technique for large-scale distributed matrix-matrix multiplication that outperforms well-known previous strategies in total execution time. Our method achieves this performance by distributing the encoding operation over the cluster (slave) nodes at the expense of increased master-slave communication. The product computation is performed using MDS array Belief Propagation (BP)-decodable codes based on pure XOR operations. In addition, our scheme is configurable and suited to modern compute node architectures equipped with multiple processing units organized in a hierarchical manner. Assuming the number of backup nodes is sublinear in the size of the product, we demonstrate that the proposed scheme achieves order-optimal computation from an end-to-end latency perspective while keeping communication requirements within what today's high-speed network links can support.
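To make the idea of XOR-based coded computation concrete, the following is a minimal sketch and not the construction from the paper. It assumes binary matrices with arithmetic over GF(2), so that XOR is exactly matrix addition, and it uses the simplest possible MDS code: two data blocks plus one XOR parity block, where any two of the three worker results suffice. Recovering a missing block from a single XOR check is the one-step case of peeling (BP) decoding. Unlike the scheme described in the abstract, the encoding here is done centrally rather than on the worker nodes, and all function names are hypothetical.

```python
# Illustrative sketch only: a (3, 2) single-parity XOR code over GF(2).
# Assumptions (not from the paper): 0/1 matrices, mod-2 arithmetic,
# an even number of rows in A, and encoding done at the master.

def matmul_gf2(a, b):
    """Multiply two 0/1 matrices with arithmetic modulo 2."""
    cols = list(zip(*b))
    return [[sum(x & y for x, y in zip(row, col)) % 2 for col in cols]
            for row in a]

def xor_blocks(x, y):
    """Element-wise XOR of two equally sized 0/1 matrices."""
    return [[p ^ q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def encode(a):
    """Split A into two row blocks and append their XOR as a parity block."""
    half = len(a) // 2
    a1, a2 = a[:half], a[half:]
    return [a1, a2, xor_blocks(a1, a2)]

def decode(results):
    """Recover A*B from the worker results; at most one may be None."""
    r1, r2, rp = results
    if r1 is None:
        r1 = xor_blocks(r2, rp)   # (A1*B) = (A2*B) XOR ((A1 XOR A2)*B)
    elif r2 is None:
        r2 = xor_blocks(r1, rp)
    return r1 + r2                # stack the two row blocks

A = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
B = [[1, 1], [0, 1], [1, 0]]

tasks = encode(A)                                   # one block per worker
worker_results = [matmul_gf2(t, B) for t in tasks]  # each worker multiplies by B
worker_results[0] = None                            # simulate one straggler
recovered = decode(worker_results)
assert recovered == matmul_gf2(A, B)
```

The sketch tolerates a single slow or failed worker. Tolerating more stragglers would require additional parity blocks and an iterative peeling decoder, which is where array BP-XOR codes come in.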
Source
IEEE International Symposium on Information Theory (ISIT) -- JUL 07-12, 2019 -- Paris, FRANCE
Start Page
1792
End Page
1796
