Recovery-oriented STAIR Codes for Storage Clusters

Introduction

Storage clusters that manage immutable big data are susceptible to transient and permanent failures at both the node and rack levels, and increasingly employ erasure codes to achieve data availability. However, most existing erasure codes are storage-inefficient, since they use an entire rack of parity information to tolerate node failures in only part of a rack. They also cause the recovery of common single-node failures to consume a significant amount of oversubscribed cross-rack bandwidth.

To relieve both the storage and recovery burdens of erasure codes, this project studies a new family of codes called recovery-oriented STAIR (R-STAIR) codes, which not only provide storage-efficient fault tolerance for mixed node and rack failures, but also achieve rack-local recovery for common single-node failures. R-STAIR codes extend our previously proposed STAIR codes to a storage cluster setting. We demonstrate the usability of R-STAIR codes by implementing them in a practical Hadoop cluster.
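
As a toy illustration of what rack-local recovery means, the sketch below rebuilds a failed node's chunk using only the surviving chunks in the same rack, so no cross-rack traffic is needed. It is not the R-STAIR construction: it assumes a single XOR parity chunk per rack, and the node names, chunk contents, and helper functions are hypothetical.

```python
from functools import reduce


def xor_chunks(chunks):
    """Return the bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# Hypothetical layout: one rack with three data nodes (equal-length chunks).
rack = {
    "node0": b"immutable",
    "node1": b"big-data!",
    "node2": b"in-a-rack",
}
# Assumption: the rack also stores one XOR parity chunk over its data chunks.
rack["parity"] = xor_chunks(list(rack.values()))


def recover_locally(rack, failed):
    """Rebuild the failed node's chunk from the other nodes in the same rack."""
    survivors = [chunk for name, chunk in rack.items() if name != failed]
    return xor_chunks(survivors)


# A single-node failure is repaired without reading from any other rack.
recovered = recover_locally(rack, "node1")
assert recovered == rack["node1"]
```

A single XOR parity tolerates only one failure per rack. Tolerating mixed node and rack failures requires additional parity, which is where the code design matters.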

Publications

Mingqiang Li, Runhui Li, and Patrick P. C. Lee.
"Relieving Both Storage and Recovery Burdens in Big Data Clusters with R-STAIR Codes."
Poster presentation: USENIX Annual Technical Conference (ATC'15), July 2015.

Runhui Li, Yuchong Hu, and Patrick P. C. Lee.
"Enabling Efficient and Reliable Transition from Replication to Erasure Coding for Clustered File Systems."
Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2015) (Regular paper), Rio de Janeiro, Brazil, June 2015.
[pdf] [pptx] [software]

Mingqiang Li and Patrick P. C. Lee.
"STAIR Codes: A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems."
Proceedings of the 12th USENIX Conference on File and Storage Technologies (FAST '14), Santa Clara, CA, February 2014.
(Acceptance rate: 24/133 = 18%)
[pdf] [slides] [poster]

Download

R-STAIR code implementation on Hadoop: RSTAIR-Hadoop-1.0.0.zip

People

This project is carried out by the Advanced Network and System Research Laboratory in the Department of Computer Science and Engineering at The Chinese University of Hong Kong (CUHK).

Runhui Li (PhD)
Mingqiang Li (former Postdoc)
Patrick P. C. Lee (Faculty)

Acknowledgments

This work is supported by grants AoE/E-02/08 and ECS CUHK419212 from the University Grants Committee of Hong Kong.