η-LSTM: Co-Designing Highly-Efficient Large LSTM Training via Exploiting Memory-Saving and Architectural Design Opportunities

Published in International Symposium on Computer Architecture (ISCA), 2021

Primary contributor to the implementation and evaluation (resource, energy, and latency estimation) of the proposed hardware accelerator, and primary designer of the Omni-PE (Processing Element).

Recommended citation: Xingyao Zhang, Haojun Xia, Donglin Zhuang, Hao Sun, Xin Fu, Michael Taylor, Shuaiwen Leon Song. "η-LSTM: Co-Designing Highly-Efficient Large LSTM Training via Exploiting Memory-Saving and Architectural Design Opportunities." In Proceedings of the International Symposium on Computer Architecture (ISCA), 2021.