no code implementations • 9 Jun 2025 • Yichuan Wang, Shu Liu, Zhifei Li, Yongji Wu, Ziming Mao, Yilong Zhao, Xiao Yan, Zhiying Xu, Yang Zhou, Ion Stoica, Sewon Min, Matei Zaharia, Joseph E. Gonzalez
Embedding-based search is widely used in applications such as recommendation and retrieval-augmented generation (RAG).
no code implementations • 26 Jan 2025 • An Yang, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoyan Huang, Jiandong Jiang, Jianhong Tu, Jianwei Zhang, Jingren Zhou, Junyang Lin, Kai Dang, Kexin Yang, Le Yu, Mei Li, Minmin Sun, Qin Zhu, Rui Men, Tao He, Weijia Xu, Wenbiao Yin, Wenyuan Yu, Xiafei Qiu, Xingzhang Ren, Xinlong Yang, Yong Li, Zhiying Xu, Zipeng Zhang
By leveraging our inference framework, the Qwen2.5-1M models achieve a remarkable 3x to 7x prefill speedup in scenarios with 1 million tokens of context.
no code implementations • 30 Dec 2024 • Jiawei Zhou, Woojeong Kim, Zhiying Xu, Alexander M. Rush, Minlan Yu
Our NetFlowGen framework goes beyond a proof of concept for network traffic pre-training and addresses specific challenges such as unifying network feature representations, learning from large volumes of unlabeled traffic data, and testing on real downstream tasks in DDoS attack detection.
no code implementations • 2 Dec 2022 • Zhiying Xu, Hongding Peng, Wei Wang
Traditional deep learning compilers rely on heuristics for subgraph generation, which impose extra constraints on graph optimization, e.g., each subgraph can only contain at most one complex operator.
1 code implementation • 25 Oct 2022 • Zhiying Xu, Francis Y. Yan, Rachee Singh, Justin T. Chiu, Alexander M. Rush, Minlan Yu
The rapid expansion of global cloud wide-area networks (WANs) has posed a challenge for commercial optimization engines to efficiently solve network traffic engineering (TE) problems at scale.
no code implementations • 22 Oct 2022 • Zhiying Xu, Jiafan Xu, Hongding Peng, Wei Wang, Xiaoliang Wang, Haoran Wan, Haipeng Dai, Yixu Xu, Hao Cheng, Kun Wang, Guihai Chen
Deep learning models rely on highly optimized tensor libraries for efficient inference on heterogeneous hardware.
2 code implementations • 13 Mar 2020 • Jiawei Zhou, Zhiying Xu, Alexander M. Rush, Minlan Yu
Botnets are now a major source of many network attacks, such as DDoS attacks and spam.
no code implementations • 19 Dec 2019 • Zhiying Xu, Shuyu Shi, Alex X. Liu, Jun Zhao, Lin Chen
ADADP significantly reduces the privacy cost by improving the convergence speed with an adaptive learning rate and mitigates the negative effect of differential privacy upon the model accuracy by introducing adaptive noise.
no code implementations • 27 Nov 2019 • Jun Zhao, Teng Wang, Tao Bai, Kwok-Yan Lam, Zhiying Xu, Shuyu Shi, Xuebin Ren, Xinyu Yang, Yang Liu, Han Yu
Although both classical Gaussian mechanisms [1, 2] assume $0 < \epsilon \leq 1$, our review finds that many studies in the literature have used the classical Gaussian mechanisms under values of $\epsilon$ and $\delta$ where the added noise amounts of [1, 2] do not achieve $(\epsilon,\delta)$-DP.