TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention

1 code implementation • 7 Oct 2024 • Lijie Yang, Zhihao Zhang, Zhuofu Chen, Zikun Li, Zhihao Jia

Existing sparse attention mechanisms designed to address this bottleneck have two limitations: (1) they often fail to reliably identify the most relevant tokens for attention, and (2) they overlook the spatial coherence of token selection across consecutive Transformer layers, which can lead to performance degradation and substantial overhead in token selection.


Quarl: A Learning-Based Quantum Circuit Optimizer

no code implementations • 17 Jul 2023 • Zikun Li, Jinjun Peng, Yixuan Mei, Sina Lin, Yi Wu, Oded Padon, Zhihao Jia

Applying reinforcement learning (RL) to quantum circuit optimization raises two main challenges: the large and varying action space and the non-uniform state representation.

