Reliability of Large Scale GPU Clusters for Deep Learning Workloads. Junjie Qian | Taeyoon Kim | Myeongjae Jeon. Companion of The Web Conference 2021, Virtual Event / Ljubljana, Slovenia, April 19-23, 2021.