Performance benchmarking, analysis, and optimization of deep learning inference