Search Results for author: Zhichen Zeng

Found 20 papers, 8 papers with code

CATS: Mitigating Correlation Shift for Multivariate Time Series Classification

no code implementations 5 Apr 2025 Xiao Lin, Zhichen Zeng, Tianxin Wei, Zhining Liu, Yuzhong Chen, Hanghang Tong

Given the prevalence of multivariate time series (MTS) data across various domains, the UDA task for MTS classification has emerged as a critical challenge.

Graph Attention, Time Series, +2

ResMoE: Space-efficient Compression of Mixture of Experts LLMs via Residual Restoration

1 code implementation 10 Mar 2025 Mengting Ai, Tianxin Wei, Yifan Chen, Zhichen Zeng, Ritchie Zhao, Girish Varatkar, Bita Darvish Rouhani, Xianfeng Tang, Hanghang Tong, Jingrui He

Mixture-of-Experts (MoE) Transformer, the backbone architecture of multiple phenomenal language models, leverages sparsity by activating only a fraction of model parameters for each input token.

Mixture-of-Experts

Joint Optimal Transport and Embedding for Network Alignment

1 code implementation 26 Feb 2025 Qi Yu, Zhichen Zeng, Yuchen Yan, Lei Ying, R. Srikant, Hanghang Tong

For one thing (OT for embedding), through a simple yet effective transformation, the noise-reduced OT mapping serves as an adaptive sampling strategy directly modeling all cross-network node pairs for robust embedding learning. For another (embedding for OT), on top of the learned embeddings, the OT cost can be gradually trained in an end-to-end fashion, which further enhances the alignment quality.

Improving LLM General Preference Alignment via Optimistic Online Mirror Descent

no code implementations 24 Feb 2025 Yuheng Zhang, Dian Yu, Tao Ge, Linfeng Song, Zhichen Zeng, Haitao Mi, Nan Jiang, Dong Yu

Reinforcement learning from human feedback (RLHF) has demonstrated remarkable effectiveness in aligning large language models (LLMs) with human preferences.

External Large Foundation Model: How to Efficiently Serve Trillions of Parameters for Online Ads Recommendation

no code implementations 20 Feb 2025 Mingfu Liang, Xi Liu, Rong Jin, Boyang Liu, Qiuling Suo, Qinghai Zhou, Song Zhou, Laming Chen, Hua Zheng, Zhiyuan Li, Shali Jiang, Jiyan Yang, Xiaozhen Xia, Fan Yang, Yasmine Badr, Ellie Wen, Shuyu Xu, Hansey Chen, Zhengyu Zhang, Jade Nie, Chunzhi Yang, Zhichen Zeng, Weilin Zhang, Xingliang Huang, Qianru Li, Shiquan Wang, Evelyn Lyu, Wenjing Lu, Rui Zhang, Wenjun Wang, Jason Rudy, Mengyue Hang, Kai Wang, Yinbin Ma, Shuaiwen Wang, Sihan Zeng, Tongyi Tang, Xiaohan Wei, Longhao Jin, Jamey Zhang, Marcus Chen, Jiayi Zhang, Angie Huang, Chi Zhang, Zhengli Zhao, Jared Yang, Qiang Jin, Xian Chen, Amit Anand Amlesahwaram, Lexi Song, Liang Luo, Yuchen Hao, Nan Xiao, Yavuz Yetim, Luoshang Pan, Gaoxiang Liu, Yuxi Hu, Yuzhen Huang, Jackie Xu, Rich Zhu, Xin Zhang, Yiqun Liu, Hang Yin, Yuxin Chen, Buyun Zhang, Xiaoyi Liu, Xingyuan Wang, Wenguang Mao, Zhijing Li, Zhehui Zhou, Feifan Gu, Qin Huang, Chonglin Sun, Nancy Yu, Shuo Gu, Shupin Mao, Benjamin Au, Jingzheng Qin, Peggy Yao, Jae-Woo Choi, Bin Gao, Ernest Wang, Lei Zhang, Wen-Yen Chen, Ted Lee, Jay Zha, Yi Meng, Alex Gong, Edison Gao, Alireza Vahdatpour, Yiping Han, Yantao Yao, Toshinari Kureha, Shuo Chang, Musharaf Sultan, John Bocharov, Sagar Chordia, Xiaorui Gan, Peng Sun, Rocky Liu, Bo Long, Wenlin Chen, Santanu Kolay, Huayu Li

Second, large-volume data arrive in a streaming mode with data distributions dynamically shifting, as new users/ads join and existing users/ads leave the system.

Data Augmentation

Tactic: Adaptive Sparse Attention with Clustering and Distribution Fitting for Long-Context LLMs

no code implementations 17 Feb 2025 Kan Zhu, Tian Tang, Qinyu Xu, Yile Gu, Zhichen Zeng, Rohan Kadekodi, Liangyu Zhao, Ang Li, Arvind Krishnamurthy, Baris Kasikci

Long-context models are essential for many applications but face inefficiencies in loading large KV caches during decoding.

THeGCN: Temporal Heterophilic Graph Convolutional Network

no code implementations 21 Dec 2024 Yuchen Yan, Yuzhong Chen, Huiyuan Chen, Xiaoting Li, Zhe Xu, Zhichen Zeng, Lihui Liu, Zhining Liu, Hanghang Tong

Furthermore, we highlight that the edge heterophily issue and the temporal heterophily issue often co-exist in event-based continuous graphs, giving rise to the temporal edge heterophily challenge.

Graph Learning

A Collaborative Ensemble Framework for CTR Prediction

no code implementations 20 Nov 2024 Xiaolong Liu, Zhichen Zeng, Xiaoyi Liu, Siyang Yuan, Weinan Song, Mengyue Hang, Yiqun Liu, Chaofei Yang, Donghyun Kim, Wen-Yen Chen, Jiyan Yang, Yiping Han, Rong Jin, Bo Long, Hanghang Tong, Philip S. Yu

Recent advances in foundation models have established scaling laws that enable the development of larger models to achieve enhanced performance, motivating extensive research into large-scale recommendation models.

Click-Through Rate Prediction, Negation, +2

AdaRC: Mitigating Graph Structure Shifts during Test-Time

no code implementations 9 Oct 2024 Wenxuan Bao, Zhichen Zeng, Zhining Liu, Hanghang Tong, Jingrui He

However, existing TTA algorithms are primarily designed for attribute shifts in vision tasks, where samples are independent.

Attribute, Test-time Adaptation

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

no code implementations 12 Aug 2024 Zhiwen Mo, Lei Wang, Jianyu Wei, Zhichen Zeng, Shijie Cao, Lingxiao Ma, Naifeng Jing, Ting Cao, Jilong Xue, Fan Yang, Mao Yang

Then, LUT Tensor Core proposes the hardware design featuring an elongated tiling shape design to enhance table reuse and a bit-serial design to support various precision combinations in mpGEMM.

Language Modelling, Large Language Model

AIM: Attributing, Interpreting, Mitigating Data Unfairness

1 code implementation 13 Jun 2024 Zhining Liu, Ruizhong Qiu, Zhichen Zeng, Yada Zhu, Hendrik Hamann, Hanghang Tong

Data collected in the real world often encapsulates historical discrimination against disadvantaged groups and individuals.

Fairness

Discrete-state Continuous-time Diffusion for Graph Generation

1 code implementation 19 May 2024 Zhe Xu, Ruizhong Qiu, Yuzhong Chen, Huiyuan Chen, Xiran Fan, Menghai Pan, Zhichen Zeng, Mahashweta Das, Hanghang Tong

Graph is a prevalent discrete data structure, whose generation has wide applications such as drug discovery and circuit design.

Drug Discovery, Graph Generation

Allo: A Programming Model for Composable Accelerator Design

2 code implementations 7 Apr 2024 Hongzheng Chen, Niansong Zhang, Shaojie Xiang, Zhichen Zeng, Mengjia Dai, Zhiru Zhang

For the GPT2 model, the inference latency of the Allo generated accelerator is 1.7x faster than the NVIDIA A100 GPU with 5.4x higher energy efficiency, demonstrating the capability of Allo to handle large-scale designs.

High-Level Synthesis model

Hierarchical Multi-Marginal Optimal Transport for Network Alignment

no code implementations 6 Oct 2023 Zhichen Zeng, Boxin Du, Si Zhang, Yinglong Xia, Zhining Liu, Hanghang Tong

To depict high-order relationships across multiple networks, the FGW distance is generalized to the multi-marginal setting, based on which networks can be aligned jointly.

Ensuring User-side Fairness in Dynamic Recommender Systems

no code implementations 29 Aug 2023 Hyunsik Yoo, Zhichen Zeng, Jian Kang, Ruizhong Qiu, David Zhou, Zhining Liu, Fei Wang, Charlie Xu, Eunice Chan, Hanghang Tong

In the ever-evolving landscape of user-item interactions, continual adaptation to newly collected data is crucial for recommender systems to stay aligned with the latest user preferences.

Fairness, Recommendation Systems, +1
