Search Results for author: Xin Liu

Found 425 papers, 184 papers with code

"What's Up, Doc?": Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets

1 code implementation 26 Jun 2025 Akshay Paruchuri, Maryam Aziz, Rohit Vartak, Ayman Ali, Best Uchehara, Xin Liu, Ishan Chatterjee, Monica Agrawal

People are increasingly seeking healthcare information from large language models (LLMs) via interactive chatbots, yet the nature and inherent risks of these conversations remain largely unexplored.

FixCLR: Negative-Class Contrastive Learning for Semi-Supervised Domain Generalization

no code implementations 25 Jun 2025 Ha Min Son, Shahbaz Rezaei, Xin Liu

These experiments include benchmarking different improvements to semi-supervised methods, evaluating the performance of pretrained versus non-pretrained models, and testing on datasets with many domains.

Benchmarking Contrastive Learning +3

SwiftSpec: Ultra-Low Latency LLM Decoding by Scaling Asynchronous Speculative Decoding

no code implementations 12 Jun 2025 Ziyi Zhang, Ziheng Jiang, Chengquan Jiang, Menghan Yu, Size Zheng, Haibin Lin, Henry Hoffmann, Xin Liu

Low-latency decoding for large language models (LLMs) is crucial for applications like chatbots and code assistants, yet generating long outputs remains slow in single-query settings.

Non-Contact Health Monitoring During Daily Personal Care Routines

1 code implementation 11 Jun 2025 Xulin Ma, Jiankai Tang, Zhang Jiang, Songqin Cheng, Yuanchun Shi, Dong Li, Xin Liu, Daniel McDuff, Xiaojing Liu, Yuntao Wang

Remote photoplethysmography (rPPG) enables non-contact, continuous monitoring of physiological signals and offers a practical alternative to traditional health sensing methods.

Heart rate estimation Multi-Task Learning

Simplifying Root Cause Analysis in Kubernetes with StateGraph and LLM

no code implementations 3 Jun 2025 Yong Xiang, Charley Peter Chen, Liyi Zeng, Wei Yin, Xin Liu, Hu Li, Wei Xu

We evaluate SynergyRCA using datasets from two production Kubernetes clusters, highlighting its capacity to identify numerous root causes, including novel ones, with high efficiency and precision.

On the Necessity of Multi-Domain Explanation: An Uncertainty Principle Approach for Deep Time Series Models

no code implementations 3 Jun 2025 Shahbaz Rezaei, Avishai Halev, Xin Liu

This principle establishes a lower bound on how much a signal can be simultaneously localized in both the time and frequency domains.

Time Series

Period-LLM: Extending the Periodic Capability of Multimodal Large Language Model

1 code implementation CVPR 2025 Yuting Zhang, Hao Lu, Qingyong Hu, Yin Wang, Kaishen Yuan, Xin Liu, Kaishun Wu

Periodic or quasi-periodic phenomena reveal intrinsic characteristics in various natural processes, such as weather patterns, movement behaviors, traffic flows, and biological signals.

Language Modeling Language Modelling +2

Matryoshka Model Learning for Improved Elastic Student Models

no code implementations 29 May 2025 Chetan Verma, Aditya Srinivas Timmaraju, Cho Jui-Hsieh, Suyash Damle, Ngot Bui, Yang Zhang, Wen Chen, Xin Liu, Prateek Jain, Inderjit S Dhillon

Industry-grade ML models are carefully designed to meet rapidly evolving serving constraints, which requires significant resources for model development.

LAMBADA Math +1

BehaviorSFT: Behavioral Token Conditioning for Clinical Agents Across the Proactivity Spectrum

no code implementations 27 May 2025 Yubin Kim, Zhiyuan Hu, Hyewon Jeong, Eugene Park, Shuyue Stella Li, Chanwoo Park, Shiyun Xiong, Mingyu Lu, Hyeonhoon Lee, Xin Liu, Daniel McDuff, Cynthia Breazeal, Samir Tulebaev, Hae Won Park

To address this, we propose BehaviorSFT, a novel training strategy using behavioral tokens to explicitly condition LLMs for dynamic behavioral selection along this spectrum.

EcomScriptBench: A Multi-task Benchmark for E-commerce Script Planning via Step-wise Intention-Driven Product Association

no code implementations 21 May 2025 Weiqi Wang, Limeng Cui, Xin Liu, Sreyashi Nag, Wenju Xu, Chen Luo, Sheikh Muhammad Sarwar, Yang Li, Hansu Gu, Hui Liu, Changlong Yu, Jiaxin Bai, Yifan Gao, Haiyang Zhang, Qi He, Shuiwang Ji, Yangqiu Song

We propose a novel framework that enables the scalable generation of product-enriched scripts by associating products with each step based on the semantic similarity between the actions and their purchase intentions.

Semantic Similarity Semantic Textual Similarity

OmniFC: Rethinking Federated Clustering via Lossless and Secure Distance Reconstruction

no code implementations 19 May 2025 Jie Yan, Xin Liu, Zhong-Yuan Zhang

Federated clustering (FC) aims to discover global cluster structures across decentralized clients without sharing raw data, making privacy preservation a fundamental requirement.

Clustering

FEALLM: Advancing Facial Emotion Analysis in Multimodal Large Language Models with Emotional Synergy and Reasoning

1 code implementation 19 May 2025 Zhuozhao Hu, Kaishen Yuan, Xin Liu, Zitong Yu, Yuan Zong, Jingang Shi, Huanjing Yue, Jingyu Yang

Facial Emotion Analysis (FEA) plays a crucial role in visual affective computing, aiming to infer a person's emotional state based on facial data.

Emotion Recognition

EmotionHallucer: Evaluating Emotion Hallucinations in Multimodal Large Language Models

1 code implementation 16 May 2025 Bohao Xing, Xin Liu, Guoying Zhao, Chengyu Liu, Xiaolan Fu, Heikki Kälviäinen

By evaluating 38 LLMs and MLLMs on EmotionHallucer, we reveal that: i) most current models exhibit substantial issues with emotion hallucinations; ii) closed-source models outperform open-source ones in detecting emotion hallucinations, and reasoning capability provides additional advantages; iii) existing models perform better in emotion psychology knowledge than in multimodal emotion perception.

Hallucination

VeriFact: Enhancing Long-Form Factuality Evaluation with Refined Fact Extraction and Reference Facts

no code implementations 14 May 2025 Xin Liu, Lechen Zhang, Sheza Munir, Yiyang Gu, Lu Wang

Large language models (LLMs) excel at generating long-form responses, but evaluating their factuality remains challenging due to complex inter-sentence dependencies within the generated facts.

Benchmarking Form +1

Denoising and Alignment: Rethinking Domain Generalization for Multimodal Face Anti-Spoofing

no code implementations 14 May 2025 Yingjie Ma, Xun Lin, Zitong Yu, Xin Liu, Xiaochen Yuan, Weicheng Xie, Linlin Shen

We also design a U-shaped Dual Space Adaptation (U-DSA) module to enhance the adaptability of representations while maintaining generalization performance.

cross-modal alignment Denoising +3

Implet: A Post-hoc Subsequence Explainer for Time Series Models

1 code implementation 13 May 2025 Fanyu Meng, Ziwen Kan, Shahbaz Rezaei, Zhaodan Kong, Xin Chen, Xin Liu

Explainability in time series models is crucial for fostering trust, facilitating debugging, and ensuring interpretability in real-world applications.

Time Series Time Series Classification

Seed1.5-VL Technical Report

no code implementations 11 May 2025 Dong Guo, Faming Wu, Feida Zhu, Fuxing Leng, Guang Shi, Haobin Chen, Haoqi Fan, Jian Wang, Jianyu Jiang, Jiawei Wang, Jingji Chen, Jingjia Huang, Kang Lei, Liping Yuan, Lishu Luo, PengFei Liu, Qinghao Ye, Rui Qian, Shen Yan, Shixiong Zhao, Shuai Peng, Shuangye Li, Sihang Yuan, Sijin Wu, Tianheng Cheng, Weiwei Liu, Wenqian Wang, Xianhan Zeng, Xiao Liu, Xiaobo Qin, Xiaohan Ding, Xiaojun Xiao, Xiaoying Zhang, Xuanwei Zhang, Xuehan Xiong, Yanghua Peng, Yangrui Chen, Yanwei Li, Yanxu Hu, Yi Lin, Yiyuan Hu, Yiyuan Zhang, Youbin Wu, Yu Li, Yudong Liu, Yue Ling, Yujia Qin, Zanbo Wang, Zhiwu He, Aoxue Zhang, Bairen Yi, Bencheng Liao, Can Huang, Can Zhang, Chaorui Deng, Chaoyi Deng, Cheng Lin, Cheng Yuan, Chenggang Li, Chenhui Gou, Chenwei Lou, Chengzhi Wei, Chundian Liu, Chunyuan Li, Deyao Zhu, Donghong Zhong, Feng Li, Feng Zhang, Gang Wu, Guodong Li, Guohong Xiao, Haibin Lin, Haihua Yang, Haoming Wang, Heng Ji, Hongxiang Hao, Hui Shen, Huixia Li, Jiahao Li, Jialong Wu, Jianhua Zhu, Jianpeng Jiao, Jiashi Feng, Jiaze Chen, Jianhui Duan, Jihao Liu, Jin Zeng, Jingqun Tang, Jingyu Sun, Joya Chen, Jun Long, Junda Feng, Junfeng Zhan, Junjie Fang, Junting Lu, Kai Hua, Kai Liu, Kai Shen, Kaiyuan Zhang, Ke Shen, Ke Wang, Keyu Pan, Kun Zhang, Kunchang Li, Lanxin Li, Lei LI, Lei Shi, Li Han, Liang Xiang, Liangqiang Chen, Lin Chen, Lin Li, Lin Yan, Liying Chi, Longxiang Liu, Mengfei Du, Mingxuan Wang, Ningxin Pan, Peibin Chen, Pengfei Chen, Pengfei Wu, Qingqing Yuan, Qingyao Shuai, Qiuyan Tao, Renjie Zheng, Renrui Zhang, Ru Zhang, Rui Wang, Rui Yang, Rui Zhao, Shaoqiang Xu, Shihao Liang, Shipeng Yan, Shu Zhong, Shuaishuai Cao, Shuangzhi Wu, Shufan Liu, Shuhan Chang, Songhua Cai, Tenglong Ao, Tianhao Yang, Tingting Zhang, Wanjun Zhong, Wei Jia, Wei Weng, Weihao Yu, Wenhao Huang, Wenjia Zhu, Wenli Yang, Wenzhi Wang, Xiang Long, XiangRui Yin, Xiao Li, Xiaolei Zhu, Xiaoying Jia, Xijin Zhang, Xin Liu, Xinchen Zhang, Xinyu Yang, Xiongcai Luo, Xiuli Chen, Xuantong Zhong, Xuefeng Xiao, Xujing Li, Yan Wu, Yawei Wen, Yifan Du, Yihao Zhang, Yining Ye, Yonghui Wu, Yu Liu, Yu Yue, Yufeng Zhou, Yufeng Yuan, Yuhang Xu, Yuhong Yang, Yun Zhang, Yunhao Fang, Yuntao Li, Yurui Ren, Yuwen Xiong, Zehua Hong, Zehua Wang, Zewei Sun, Zeyu Wang, Zhao Cai, Zhaoyue Zha, Zhecheng An, Zhehui Zhao, Zhengzhuo Xu, Zhipeng Chen, Zhiyong Wu, Zhuofan Zheng, ZiHao Wang, Zilong Huang, Ziyu Zhu, Zuquan Song

We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning.

Mixture-of-Experts Multimodal Reasoning +2

Understanding Stragglers in Large Model Training Using What-if Analysis

1 code implementation 9 May 2025 JinKun Lin, Ziheng Jiang, Zuquan Song, Sida Zhao, Menghan Yu, Zhanghan Wang, Chenyuan Wang, Zuocheng Shi, Xiang Shi, Wei Jia, Zherui Liu, Shuguang Wang, Haibin Lin, Xin Liu, Aurojit Panda, Jinyang Li

Large language model (LLM) training is one of the most demanding distributed computations today, often requiring thousands of GPUs with frequent synchronization across machines.

Language Modeling Language Modelling +1

Passive Measurement of Autonomic Arousal in Real-World Settings

no code implementations 30 Apr 2025 Samy Abdel-Ghaffar, Isaac Galatzer-Levy, Conor Heneghan, Xin Liu, Sarah Kernasovskiy, Brennan Garrett, Andrew Barakat, Daniel McDuff

The autonomic nervous system (ANS) is activated during stress, which can have negative effects on cardiovascular health, sleep, the immune system, and mental health.

valid

MMHCL: Multi-Modal Hypergraph Contrastive Learning for Recommendation

1 code implementation 23 Apr 2025 Xu Guo, Tong Zhang, Fuyun Wang, Xudong Wang, Xiaoya Zhang, Xin Liu, Zhen Cui

For a comprehensive information exploration from user-product relations, we construct two hypergraphs, i.e., a user-to-user (u2u) hypergraph and an item-to-item (i2i) hypergraph, to mine shared preferences among users and intricate multimodal semantic resemblance among items, respectively.

Contrastive Learning Hypergraph Contrastive Learning +1

NTIRE 2025 Challenge on Image Super-Resolution (×4): Methods and Results

2 code implementations 20 Apr 2025 Zheng Chen, Kai Liu, Jue Gong, Jingkai Wang, Lei Sun, Zongwei Wu, Radu Timofte, Yulun Zhang, Xiangyu Kong, Xiaoxuan Yu, Hyunhee Park, Suejin Han, Hakjae Jeon, Dafeng Zhang, Hyung-Ju Chun, Donghun Ryou, Inju Ha, Bohyung Han, Lu Zhao, Yuyi Zhang, Pengyu Yan, Jiawei Hu, Pengwei Liu, Fengjun Guo, Hongyuan Yu, Pufan Xu, Zhijuan Huang, Shuyuan Cui, Peng Guo, Jiahui Liu, Dongkai Zhang, Heng Zhang, Huiyuan Fu, Huadong Ma, Yanhui Guo, Sisi Tian, Xin Liu, Jinwen Liang, Jie Liu, Jie Tang, Gangshan Wu, Zeyu Xiao, Zhuoyuan Li, Yinxiang Zhang, Wenxuan Cai, Vijayalaxmi Ashok Aralikatti, Nikhil Akalwadi, G Gyaneshwar Rao, Chaitra Desai, Ramesh Ashok Tabib, Uma Mudenagudi, Marcos V. Conde, Alejandro Merino, Bruno Longarela, Javier Abad, Weijun Yuan, Zhan Li, Zhanglu Chen, Boyang Yao, Aagam Jain, Milan Kumar Singh, Ankit Kumar, Shubh Kawa, Divyavardhan Singh, Anjali Sarvaiya, Kishor Upla, Raghavendra Ramachandra, Chia-Ming Lee, Yu-Fan Lin, Chih-Chung Hsu, Risheek V Hiremath, Yashaswini Palani, YuXuan Jiang, Qiang Zhu, Siyue Teng, Fan Zhang, Shuyuan Zhu, Bing Zeng, David Bull, Jingwei Liao, Yuqing Yang, Wenda Shao, Junyi Zhao, Qisheng Xu, Kele Xu, Sunder Ali Khowaja, Ik Hyun Lee, Snehal Singh Tomar, Rajarshi Ray, Klaus Mueller, Sachin Chaudhary, Surya Vashisth, Akshay Dudhane, Praful Hambarde, Satya Naryan Tazi, Prashant Patil, Santosh Kumar Vipparthi, Subrahmanyam Murala, Bilel Benjdira, Anas M. Ali, Wadii Boulila, Zahra Moammeri, Ahmad Mahmoudi-Aznaveh, Ali Karbasi, Hossein Motamednia, Liangyan Li, Guanhua Zhao, Kevin Le, Yimo Ning, Haoxuan Huang, Jun Chen

This paper presents the NTIRE 2025 image super-resolution (×4) challenge, one of the associated competitions of the 10th NTIRE Workshop at CVPR 2025.

Image Super-Resolution valid

KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference

no code implementations 14 Apr 2025 Yuxuan Tian, Zihan Wang, Yebo Peng, Aomufei Yuan, Zhiming Wang, Bairen Yi, Xin Liu, Yong Cui, Tong Yang

Efficient inference of large language models (LLMs) is hindered by an ever-growing key-value (KV) cache, making KV cache compression a critical research direction.

HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation

1 code implementation 13 Apr 2025 Pei Liu, Xin Liu, Ruoyu Yao, Junming Liu, Siyuan Meng, Ding Wang, Jun Ma

While Retrieval-Augmented Generation (RAG) augments Large Language Models (LLMs) with external knowledge, conventional single-agent RAG remains fundamentally limited in resolving complex queries demanding coordinated reasoning across heterogeneous data ecosystems.

Multimodal Reasoning RAG +2

Multi-Modal Hypergraph Enhanced LLM Learning for Recommendation

no code implementations 13 Apr 2025 Xu Guo, Tong Zhang, Yuanzhi Wang, Chenxu Wang, Fuyun Wang, Xudong Wang, Xiaoya Zhang, Xin Liu, Zhen Cui

To this end, we propose a novel framework, Hypergraph Enhanced LLM Learning for multimodal Recommendation (HeLLM), designed to equip LLMs with the capability to capture intricate higher-order semantic correlations by fusing graph-level contextual signals with sequence-level behavioral patterns.

Contrastive Learning Multimodal Recommendation

Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning

no code implementations 10 Apr 2025 ByteDance Seed, Jiaze Chen, Tiantian Fan, Xin Liu, Lingjun Liu, Zhiqi Lin, Mingxuan Wang, Chengyi Wang, Xiangpeng Wei, Wenyuan Xu, Yufeng Yuan, Yu Yue, Lin Yan, Qiying Yu, Xiaochen Zuo, Chi Zhang, Ruofei Zhu, Zhecheng An, Zhihao Bai, Yu Bao, Xingyan Bin, Jiangjie Chen, Feng Chen, Hongmin Chen, Riwei Chen, Liangqiang Chen, Zixin Chen, Jinsong Chen, Siyan Chen, Kaiyuan Chen, Zhi Chen, Jin Chen, Jiecao Chen, Jinxin Chi, Weinan Dai, Ning Dai, Jiahui Dai, Shihan Dou, Yantao Du, Zhengyin Du, Jianhui Duan, Chen Dun, Ting-Han Fan, Jiazhan Feng, Junda Feng, Ziyuan Feng, Yuwei Fu, Wenqi Fu, Hanjie Fu, Hao Ge, Hongyi Guo, Mingji Han, Li Han, Wenhao Hao, Xintong Hao, Qianyu He, Jerry He, Feng He, Wen Heng, Zehua Hong, Qi Hou, Liang Hu, Shengding Hu, Nan Hu, Kai Hua, Qi Huang, Ziyue Huang, Hongzhi Huang, Zihao Huang, Ting Huang, Wenhao Huang, Wei Jia, Bin Jia, Xiaoying Jia, Yuhua Jiang, Haobin Jiang, Ziheng Jiang, Kaihua Jiang, Chengquan Jiang, Jianpeng Jiao, Xiaoran Jin, Xing Jin, Xunhao Lai, Xiang Li, Liyi Li, Hongkai Li, Zheng Li, Shengxian Wan, Ya Wang, Yunshui Li, Chenggang Li, Niuniu Li, Siyu Li, Xi Li, Xiao Li, Aoyan Li, Yuntao Li, Nianning Liang, Xinnian Liang, Haibin Lin, Weijian Lin, Ye Lin, Zhicheng Liu, Guanlin Liu, Chenxiao Liu, Yan Liu, Gaohong Liu, Juncai Liu, Chundian Liu, Deyi Liu, Kaibo Liu, Siyao Liu, Qi Liu, Yongfei Liu, Kang Liu, Gan Liu, Boyi Liu, Rui Long, Weiqiang Lou, Chenwei Lou, Xiang Luo, Yao Luo, Caiping Lv, Heyang Lv, Bole Ma, Qianli Ma, Hongzhi Ma, Yiyuan Ma, Jin Ma, Wenchang Ma, Tingting Ma, Chen Mao, Qiyang Min, Zhe Nan, Guanghan Ning, Jinxiang Ou, Haojie Pan, Renming Pang, Yanghua Peng, Tao Peng, Lihua Qian, Mu Qiao, Meng Qu, Cheng Ren, Hongbin Ren, Yong Shan, Wei Shen, Ke Shen, Kai Shen, Guangming Sheng, Jinlong Shi, Wenlei Shi, Guang Shi, Shuai Shuai Cao, Yuxin Song, Zuquan Song, Jing Su, Yifan Sun, Tao Sun, Zewei Sun, Borui Wan, Xiaohui Wang, Xi Wang, Shuguang Wang, Jun Wang, Qinlong Wang, Chenyuan Wang, Shuai Wang, Zihan Wang, Changbao Wang, Jiaqiang Wang, Shihang Wang, Xuwu Wang, Zaiyuan Wang, Yuxuan Wang, Wenqi Wang, Taiqing Wang, Chengzhi Wei, Houmin Wei, Ziyun Wei, Shufa Wei, Zheng Wu, Yonghui Wu, Yangjun Wu, Bohong Wu, Shuang Wu, Jingqiao Wu, Ning Wu, Shuangzhi Wu, Jianmin Wu, Chenguang Xi, Fan Xia, Yuqiao Xian, Liang Xiang, Boren Xiang, Bowen Xiao, Zhen Xiao, Xia Xiao, Yongsheng Xiao, Chao Xin, Shulin Xin, Yuwen Xiong, Jingjing Xu, Ziwen Xu, Chenyin Xu, Jiayi Xu, Yifan Xu, Wei Xu, Yufei Xu, Shikun Xu, Shipeng Yan, Shen Yan, Qingping Yang, Xi Yang, Tianhao Yang, Yuehang Yang, Yuan Yang, Ximing Yang, Zeyu Yang, Guang Yang, Yifan Yang, Xuesong Yao, Bairen Yi, Fan Yin, Jianian Yin, Ziqiang Ying, Xiangyu Yu, Hongli Yu, Song Yu, Menghan Yu, Huan Yu, Siyu Yuan, Jun Yuan, Yutao Zeng, Tianyang Zhan, Zheng Zhang, Yun Zhang, Mofan Zhang, Wang Zhang, Ru Zhang, Zhi Zhang, Tianqi Zhang, Xinyi Zhang, Zhexi Zhang, Sijun Zhang, Wenqiang Zhang, Xiangxiang Zhang, Yongtao Zhang, Yuyu Zhang, Ge Zhang, He Zhang, Yue Zhang, Renjie Zheng, Ningxin Zheng, Zhuolin Zheng, Yaowei Zheng, Chen Zheng, Xiaoyun Zhi, Wanjun Zhong, Cheng Zhong, Zheng Zhong, Baoquan Zhong, Xun Zhou, Na Zhou, Huan Zhou, Hang Zhu, Defa Zhu, Wenjia Zhu, Lei Zuo

We introduce Seed1.5-Thinking, capable of reasoning through thinking before responding, resulting in improved performance on a wide range of benchmarks.

Mixture-of-Experts reinforcement-learning +1

AU-TTT: Vision Test-Time Training model for Facial Action Unit Detection

no code implementations 30 Mar 2025 Bohao Xing, Kaishen Yuan, Zitong Yu, Xin Liu, Heikki Kälviäinen

Facial Action Units (AUs) detection is a cornerstone of objective facial expression analysis and a critical focus in affective computing.

Action Unit Detection Facial Action Unit Detection +1

A Scalable Framework for Evaluating Health Language Models

no code implementations 30 Mar 2025 Neil Mallinar, A. Ali Heydari, Xin Liu, Anthony Z. Faranesh, Brent Winslow, Nova Hammerquist, Benjamin Graef, Cathy Speed, Mark Malhotra, Shwetak Patel, Javier L. Prieto, Daniel McDuff, Ahmed A. Metwally

Our approach is based on recent work in more general evaluation settings that contrasts a smaller set of complex evaluation targets with a larger set of more precise, granular targets answerable with simple boolean responses.

Towards Fully Automated Decision-Making Systems for Greenhouse Control: Challenges and Opportunities

no code implementations 27 Mar 2025 Yongshuai Liu, Taeyeong Choi, Xin Liu

Machine learning has been successful in building control policies to drive a complex system to desired states in various applications (e.g., games and robotics).

Decision Making

Look Before Leap: Look-Ahead Planning with Uncertainty in Reinforcement Learning

no code implementations 26 Mar 2025 Yongshuai Liu, Xin Liu

In the policy optimization phase, we leverage an uncertainty-driven exploratory policy to actively collect diverse training samples, resulting in improved model accuracy and overall performance of the RL agent.

Atari Games Model-based Reinforcement Learning

Substance over Style: Evaluating Proactive Conversational Coaching Agents

no code implementations 25 Mar 2025 Vidya Srinivas, Xuhai Xu, Xin Liu, Kumar Ayush, Isaac Galatzer-Levy, Shwetak Patel, Daniel McDuff, Tim Althoff

While NLP research has made strides in conversational tasks, many approaches focus on single-turn responses with well-defined objectives or evaluation criteria.

Adventurer: Exploration with BiGAN for Deep Reinforcement Learning

no code implementations 24 Mar 2025 Yongshuai Liu, Xin Liu

Unfortunately, no single algorithm outperforms all others in all tasks and most of them struggle with tasks with high-dimensional and complex observations.

Atari Games Deep Reinforcement Learning +2

ZeroMerge: Parameter-Free KV Cache Compression for Memory-Efficient Long-Context LLMs

2 code implementations 13 Mar 2025 Xin Liu, Pei Liu, Guoming Tang

The linear growth of key-value (KV) cache memory and quadratic computational complexity pose significant bottlenecks for large language models (LLMs) in long-context processing.

CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution

1 code implementation CVPR 2025 Xin Liu, Jie Liu, Jie Tang, Gangshan Wu

SR typically leverages the redundancy of images for reconstruction, and this redundancy appears not only in local regions but also in long-range regions.

Image Super-Resolution

A Graph-Partitioning Based Continuous Optimization Approach to Semi-supervised Clustering Problems

no code implementations 6 Mar 2025 Wei Liu, Xin Liu, Michael K. Ng, Zaikun Zhang

In this work, we view the semi-supervised clustering task as a partitioning problem on a graph associated with the given dataset, where the similarity matrix includes a scaling parameter to reflect the must-link constraints.

Clustering graph partitioning

The Distributionally Robust Optimization Model of Sparse Principal Component Analysis

no code implementations 4 Mar 2025 Lei Wang, Xin Liu, Xiaojun Chen

We prove the Riemannian gradient consistency and global convergence of our algorithm to a stationary point of the nonsmooth minimization problem.

Riemannian optimization

Passive Heart Rate Monitoring During Smartphone Use in Everyday Life

no code implementations 4 Mar 2025 Shun Liao, Paolo Di Achille, Jiang Wu, Silviu Borac, Jonathan Wang, Xin Liu, Eric Teasley, Lawrence Cai, Yuzhe Yang, Yun Liu, Daniel McDuff, Hao-Wei Su, Brent Winslow, Anupam Pathak, Shwetak Patel, James A. Taylor, Jameson K. Rogers, Ming-Zher Poh

Resting heart rate (RHR) is an important biomarker of cardiovascular health and mortality, but tracking it longitudinally generally requires a wearable device, limiting its availability.

AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning

1 code implementation CVPR 2025 Yuheng Xu, Shijie Yang, Xin Liu, Jie Liu, Jie Tang, Gangshan Wu

Moreover, their reliance on fixed sampling patterns limits both accuracy and the ability to capture fine details in low-resolution images.

Image Super-Resolution

DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing

no code implementations 1 Mar 2025 Jingyi Yang, Xun Lin, Zitong Yu, Liepiao Zhang, Xin Liu, Hui Li, Xiaochen Yuan, Xiaochun Cao

We identify two main types of misalignment: (1) Intra-domain modality misalignment, where the importance of each modality varies across different attacks.

Domain Generalization Face Anti-Spoofing

ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs

no code implementations 28 Feb 2025 Hao Ge, Junda Feng, Qi Huang, Fangcheng Fu, Xiaonan Nie, Lei Zuo, Haibin Lin, Bin Cui, Xin Liu

The mismatch between data heterogeneity and static mesh causes redundant communication and imbalanced computation, degrading the training efficiency.

ReCon: Enhancing True Correspondence Discrimination through Relation Consistency for Robust Noisy Correspondence Learning

1 code implementation CVPR 2025 Quanxing Zha, Xin Liu, Shu-Juan Peng, Yiu-ming Cheung, Xing Xu, Nannan Wang

To address this problem, we propose a general Relation Consistency learning framework, namely ReCon, to accurately discriminate the true correspondences among the multimodal data and thus effectively mitigate the adverse impact caused by mismatches.

Cross-modal retrieval with noisy correspondence Image-text matching +2

BIG-Bench Extra Hard

1 code implementation 26 Feb 2025 Mehran Kazemi, Bahare Fatemi, Hritik Bansal, John Palowitch, Chrysovalantis Anastasiou, Sanket Vaibhav Mehta, Lalit K. Jain, Virginia Aglietti, Disha Jindal, Peter Chen, Nishanth Dikkala, Gladys Tyen, Xin Liu, Uri Shalit, Silvia Chiappa, Kate Olszewska, Yi Tay, Vinh Q. Tran, Quoc V. Le, Orhan Firat

One particular exception is the BIG-Bench dataset, which has served as a crucial benchmark for evaluating the general reasoning capabilities of LLMs, thanks to its diverse set of challenging tasks that allowed for a comprehensive assessment of general reasoning across various skills within a unified framework.

Talking to the brain: Using Large Language Models as Proxies to Model Brain Semantic Representation

no code implementations 26 Feb 2025 Xin Liu, Ziyue Zhang, Jingxin Nie

Traditional psychological experiments utilizing naturalistic stimuli face challenges in manual annotation and ecological validity.

Question Answering valid +1

Medical Hallucinations in Foundation Models and Their Impact on Healthcare

1 code implementation 26 Feb 2025 Yubin Kim, Hyewon Jeong, Shan Chen, Shuyue Stella Li, Mingyu Lu, Kumail Alhamoud, Jimin Mun, Cristina Grau, Minseok Jung, Rodrigo Gameiro, Lizhou Fan, Eugene Park, Tristan Lin, Joonsik Yoon, Wonjin Yoon, Maarten Sap, Yulia Tsvetkov, Paul Liang, Xuhai Xu, Xin Liu, Daniel McDuff, Hyeonhoon Lee, Hae Won Park, Samir Tulebaev, Cynthia Breazeal

Our contributions include (1) a taxonomy for understanding and addressing medical hallucinations, (2) benchmarking models using medical hallucination dataset and physician-annotated LLM responses to real medical cases, providing direct insight into the clinical impact of hallucinations, and (3) a multi-national clinician survey on their experiences with medical hallucinations.

Benchmarking Hallucination

Discriminative Finetuning of Generative Large Language Models without Reward Models and Human Preference Data

1 code implementation 25 Feb 2025 Siqi Guo, Ilgee Hong, Vicente Balmaseda, Changlong Yu, Liang Qiu, Xin Liu, Haoming Jiang, Tuo Zhao, Tianbao Yang

To address its limitations, the existing common strategy is to follow SFT with a separate phase of preference optimization (PO), which relies on either human-labeled preference data or a strong reward model to guide the learning process.

VLM-E2E: Enhancing End-to-End Autonomous Driving with Multimodal Driver Attention Fusion

no code implementations 25 Feb 2025 Pei Liu, Haipeng Liu, Haichao Liu, Xin Liu, Jinxin Ni, Jun Ma

Our method integrates textual representations into Bird's-Eye-View (BEV) features for semantic supervision, which enables the model to learn richer feature representations that explicitly capture the driver's attentional semantics.

Autonomous Driving Navigate +1

PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving

no code implementations 22 Feb 2025 Mihir Parmar, Xin Liu, Palash Goyal, Yanfei Chen, Long Le, Swaroop Mishra, Hossein Mobahi, Jindong Gu, Zifeng Wang, Hootan Nakhost, Chitta Baral, Chen-Yu Lee, Tomas Pfister, Hamid Palangi

Recent agent frameworks and inference-time algorithms often struggle with complex planning problems due to limitations in verifying generated plans or reasoning and varying complexity of instances within a single task.

IHEval: Evaluating Language Models on Following the Instruction Hierarchy

1 code implementation 12 Feb 2025 Zhihan Zhang, Shiyang Li, Zixuan Zhang, Xin Liu, Haoming Jiang, Xianfeng Tang, Yifan Gao, Zheng Li, Haodong Wang, Zhaoxuan Tan, Yichuan Li, Qingyu Yin, Bing Yin, Meng Jiang

The instruction hierarchy, which establishes a priority order from system messages to user messages, conversation history, and tool outputs, is essential for ensuring consistent and safe behavior in language models (LMs).

Instruction Following

CodePhys: Robust Video-based Remote Physiological Measurement through Latent Codebook Querying

no code implementations 11 Feb 2025 Shuyang Chu, Menghan Xia, Mengyao Yuan, Xin Liu, Tapio Seppanen, Guoying Zhao, Jingang Shi

In this paper, we propose a novel method named CodePhys, which innovatively treats rPPG measurement as a code query task in a noise-free proxy space (i.e., codebook) constructed by ground-truth PPG signals.

Heart rate estimation

Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training

no code implementations 10 Feb 2025 Yuchen Zhuang, Jingfeng Yang, Haoming Jiang, Xin Liu, Kewei Cheng, Sanket Lokegaonkar, Yifan Gao, Qing Ping, Tianyi Liu, Binxuan Huang, Zheng Li, Zhengyang Wang, Pei Chen, Ruijie Wang, Rongzhi Zhang, Nasser Zalmout, Priyanka Nigam, Bing Yin, Chao Zhang

Due to the scarcity of agent-oriented pre-training data, LLM-based autonomous agents typically rely on complex prompting or extensive fine-tuning, which often fails to introduce new capabilities while preserving strong generalizability.

Confidence intervals for intentionally biased estimators

no code implementations 1 Feb 2025 David M. Kaplan, Xin Liu

We propose and study three confidence intervals (CIs) centered at an estimator that is intentionally biased to reduce mean squared error.

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

4 code implementations 22 Jan 2025 DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, Xiaokang Zhang, Xingkai Yu, Yu Wu, Z. F. Wu, Zhibin Gou, Zhihong Shao, Zhuoshu Li, Ziyi Gao, Aixin Liu, Bing Xue, Bingxuan Wang, Bochao Wu, Bei Feng, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai Dai, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fucong Dai, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Han Bao, Hanwei Xu, Haocheng Wang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Qu, Hui Li, JianZhong Guo, Jiashi Li, Jiawei Wang, Jingchang Chen, Jingyang Yuan, Junjie Qiu, Junlong Li, J. L. Cai, Jiaqi Ni, Jian Liang, Jin Chen, Kai Dong, Kai Hu, Kaige Gao, Kang Guan, Kexin Huang, Kuai Yu, Lean Wang, Lecong Zhang, Liang Zhao, Litong Wang, Liyue Zhang, Lei Xu, Leyi Xia, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Meng Li, Miaojun Wang, Mingming Li, Ning Tian, Panpan Huang, Peng Zhang, Qiancheng Wang, Qinyu Chen, Qiushi Du, Ruiqi Ge, Ruisong Zhang, Ruizhe Pan, Runji Wang, R. J. Chen, R. L. Jin, Ruyi Chen, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shiyu Wang, Shuiping Yu, Shunfeng Zhou, Shuting Pan, S. S. Li, Shuang Zhou, Shaoqing Wu, Shengfeng Ye, Tao Yun, Tian Pei, Tianyu Sun, T. Wang, Wangding Zeng, Wanjia Zhao, Wen Liu, Wenfeng Liang, Wenjun Gao, Wenqin Yu, Wentao Zhang, W. L. Xiao, Wei An, Xiaodong Liu, Xiaohan Wang, Xiaokang Chen, Xiaotao Nie, Xin Cheng, Xin Liu, Xin Xie, Xingchao Liu, Xinyu Yang, Xinyuan Li, Xuecheng Su, Xuheng Lin, X. Q. Li, Xiangyue Jin, Xiaojin Shen, Xiaosha Chen, Xiaowen Sun, Xiaoxiang Wang, Xinnan Song, Xinyi Zhou, Xianzu Wang, Xinxia Shan, Y. K. Li, Y. Q. Wang, Y. X. Wei, Yang Zhang, Yao Li, Yao Zhao, Yaofeng Sun, Yaohui Wang, Yi Yu, Yichao Zhang, Yifan Shi, Yiliang Xiong, Ying He, Yishi Piao, Yisong Wang, Yixuan Tan, Yiyang Ma, Yiyuan Liu, Yongqiang Guo, Yuan Ou, Yuduan Wang, Yue Gong, Yuheng Zou, Yujia He, Yunfan Xiong, Yuxiang Luo, Yuxiang You, Yuxuan Liu, Yuyang Zhou, Y. X. Zhu, Yanhong Xu, Yanping Huang, Yaohui Li, Yi Zheng, Yuchen Zhu, Yunxian Ma, Ying Tang, Yukun Zha, Yuting Yan, Z. Z. Ren, Zehui Ren, Zhangli Sha, Zhe Fu, Zhean Xu, Zhenda Xie, Zhengyan Zhang, Zhewen Hao, Zhicheng Ma, Zhigang Yan, Zhiyu Wu, Zihui Gu, Zijia Zhu, Zijun Liu, Zilin Li, Ziwei Xie, Ziyang Song, Zizheng Pan, Zhen Huang, Zhipeng Xu, Zhongyu Zhang, Zhen Zhang

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.

Mathematical Reasoning Multi-task Language Understanding +2

Advancing ALS Applications with Large-Scale Pre-training: Dataset Development and Downstream Assessment

1 code implementation9 Jan 2025 Haoyi Xiu, Xin Liu, TaeHoon Kim, Kyoung-Sook Kim

In this study, we address this gap by constructing a large-scale ALS point cloud dataset and evaluating its impact on downstream applications.

Scene Recognition Self-Supervised Learning +1

Microservice Deployment in Space Computing Power Networks via Robust Reinforcement Learning

no code implementations8 Jan 2025 Zhiyong Yu, Yuning Jiang, Xin Liu, Yuanming Shi, Chunxiao Jiang, Linling Kuang

With the growing demand for Earth observation, it is important to provide reliable real-time remote sensing inference services to meet the low-latency requirements.

Earth Observation

No Pains, More Gains: Recycling Sub-Salient Patches for Efficient High-Resolution Image Recognition

1 code implementation CVPR 2025 Rong Qin, Xin Liu, Xingyu Liu, Jiaxuan Liu, Jinglei Shi, Liang Lin, Jufeng Yang

Over the last decade, many notable methods have emerged to tackle the computational resource challenge of high-resolution image recognition (HRIR).

Multiple Instance Learning

STDD: Spatio-Temporal Dual Diffusion for Video Generation

no code implementations CVPR 2025 Shuaizhen Yao, Xiaoya Zhang, Xin Liu, Mengyi Liu, Zhen Cui

In this work, we propose an explicit Spatio-Temporal Dual Diffusion (STDD) method by principledly extending the standard diffusion model to a spatio-temporal diffusion model for joint spatial and temporal noise propagation/reduction.

Text-to-Video Generation Video Generation

Distribution Prototype Diffusion Learning for Open-set Supervised Anomaly Detection

no code implementations CVPR 2025 Fuyun Wang, Tong Zhang, Yuanzhi Wang, Yide Qiu, Xin Liu, Xu Guo, Zhen Cui

In Open-set Supervised Anomaly Detection (OSAD), the existing methods typically generate pseudo anomalies to compensate for the scarcity of observed anomaly samples, while overlooking critical priors of normal samples, leading to less effective discriminative boundaries.

Supervised Anomaly Detection

FSBench: A Figure Skating Benchmark for Advancing Artistic Sports Understanding

no code implementations CVPR 2025 Rong Gao, Xin Liu, Zhuozhao Hu, Bohao Xing, Baiqiang Xia, Zitong Yu, Heikki Kälviäinen

Figure skating, known as the "Art on Ice," is among the most artistic sports, challenging to understand due to its blend of technical elements (like jumps and spins) and overall artistic expression.

Action Recognition Multiple-choice +1

DeepSeek-V3 Technical Report

4 code implementations27 Dec 2024 DeepSeek-AI, Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fucong Dai, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Han Bao, Hanwei Xu, Haocheng Wang, Haowei Zhang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Li, Hui Qu, J. L. Cai, Jian Liang, JianZhong Guo, Jiaqi Ni, Jiashi Li, Jiawei Wang, Jin Chen, Jingchang Chen, Jingyang Yuan, Junjie Qiu, Junlong Li, Junxiao Song, Kai Dong, Kai Hu, Kaige Gao, Kang Guan, Kexin Huang, Kuai Yu, Lean Wang, Lecong Zhang, Lei Xu, Leyi Xia, Liang Zhao, Litong Wang, Liyue Zhang, Meng Li, Miaojun Wang, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Mingming Li, Ning Tian, Panpan Huang, Peiyi Wang, Peng Zhang, Qiancheng Wang, Qihao Zhu, Qinyu Chen, Qiushi Du, R. J. Chen, R. L. Jin, Ruiqi Ge, Ruisong Zhang, Ruizhe Pan, Runji Wang, Runxin Xu, Ruoyu Zhang, Ruyi Chen, S. S. Li, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shaoqing Wu, Shengfeng Ye, Shirong Ma, Shiyu Wang, Shuang Zhou, Shuiping Yu, Shunfeng Zhou, Shuting Pan, T. Wang, Tao Yun, Tian Pei, Tianyu Sun, W. L. Xiao, Wangding Zeng, Wanjia Zhao, Wei An, Wen Liu, Wenfeng Liang, Wenjun Gao, Wenqin Yu, Wentao Zhang, X. Q. Li, Xiangyue Jin, Xianzu Wang, Xiao Bi, Xiaodong Liu, Xiaohan Wang, Xiaojin Shen, Xiaokang Chen, Xiaokang Zhang, Xiaosha Chen, Xiaotao Nie, Xiaowen Sun, Xiaoxiang Wang, Xin Cheng, Xin Liu, Xin Xie, Xingchao Liu, Xingkai Yu, Xinnan Song, Xinxia Shan, Xinyi Zhou, Xinyu Yang, Xinyuan Li, Xuecheng Su, Xuheng Lin, Y. K. Li, Y. Q. Wang, Y. X. Wei, Y. X. Zhu, Yang Zhang, Yanhong Xu, Yanping Huang, Yao Li, Yao Zhao, Yaofeng Sun, Yaohui Li, Yaohui Wang, Yi Yu, Yi Zheng, Yichao Zhang, Yifan Shi, Yiliang Xiong, Ying He, Ying Tang, Yishi Piao, Yisong Wang, Yixuan Tan, Yiyang Ma, Yiyuan Liu, Yongqiang Guo, Yu Wu, Yuan Ou, Yuchen Zhu, Yuduan Wang, Yue Gong, Yuheng Zou, Yujia He, Yukun Zha, Yunfan Xiong, Yunxian Ma, Yuting Yan, Yuxiang Luo, Yuxiang You, Yuxuan Liu, Yuyang Zhou, Z. F. Wu, Z. Z. Ren, Zehui Ren, Zhangli Sha, Zhe Fu, Zhean Xu, Zhen Huang, Zhen Zhang, Zhenda Xie, Zhengyan Zhang, Zhewen Hao, Zhibin Gou, Zhicheng Ma, Zhigang Yan, Zhihong Shao, Zhipeng Xu, Zhiyu Wu, Zhongyu Zhang, Zhuoshu Li, Zihui Gu, Zijia Zhu, Zijun Liu, Zilin Li, Ziwei Xie, Ziyang Song, Ziyi Gao, Zizheng Pan

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.

Language Modeling Language Modelling +1

The Impact of Cut Layer Selection in Split Federated Learning

no code implementations20 Dec 2024 Justin Dachille, Chao Huang, Xin Liu

Split Federated Learning (SFL) is a distributed machine learning paradigm that combines federated learning and split learning.

Federated Learning

Sample-efficient Unsupervised Policy Cloning from Ensemble Self-supervised Labeled Videos

no code implementations14 Dec 2024 Xin Liu, Yaran Chen

In this paper, we try to let machines replicate this efficient watching-and-learning process through Unsupervised Policy from Ensemble Self-supervised labeled Videos (UPESV), a novel framework to efficiently learn policies from videos without any other expert supervision.

SpecFuse: Ensembling Large Language Models via Next-Segment Prediction

no code implementations10 Dec 2024 Bo Lv, Chen Tang, Yanan Zhang, Xin Liu, Yue Yu, Ping Luo

In this paper, we propose SpecFuse, a novel ensemble framework that outputs the fused result by iteratively producing the next segment through collaboration among LLMs.

Prediction

Hyperspectral Image Spectral-Spatial Feature Extraction via Tensor Principal Component Analysis

no code implementations8 Dec 2024 Yuemei Ren, Liang Liao, Stephen John Maybank, Yanning Zhang, Xin Liu

This paper addresses the challenge of spectral-spatial feature extraction for hyperspectral image classification by introducing a novel tensor-based framework.

Hyperspectral image analysis Hyperspectral Image Classification +1

Safe and Efficient Online Convex Optimization with Linear Budget Constraints and Partial Feedback

no code implementations5 Dec 2024 Shanqi Liu, Xin Liu

This paper studies online convex optimization with unknown linear budget constraints, where only the gradient information of the objective and the bandit feedback of constraint functions are observed.

Distributed Sign Momentum with Local Steps for Training Transformers

1 code implementation26 Nov 2024 Shuhua Yu, Ding Zhou, Cong Xie, An Xu, Zhi Zhang, Xin Liu, Soummya Kar

Pre-training Transformer models is resource-intensive, and recent studies have shown that sign momentum is an efficient technique for training large-scale deep learning models, particularly Transformers.

Federated Learning

A Plug-and-Play Temporal Normalization Module for Robust Remote Photoplethysmography

1 code implementation22 Nov 2024 Kegang Wang, Jiankai Tang, Yantao Wei, Mingxuan Liu, Xin Liu, Yuntao Wang

Remote photoplethysmography (rPPG) extracts PPG signals from subtle color changes in facial videos, showing strong potential for health applications.

SEMPose: A Single End-to-end Network for Multi-object Pose Estimation

no code implementations21 Nov 2024 Xin Liu, Hao Wang, Shibei Xue, Dezong Zhao

On the LM-O and YCB-V datasets, our method outperforms other RGB-based single-model methods, achieving higher accuracy.

Object Pose Estimation

MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models

1 code implementation15 Nov 2024 Jianhong Tu, Zhuohao Ni, Nicholas Crispino, Zihao Yu, Michael Bendersky, Beliz Gunel, Ruoxi Jia, Xin Liu, Lingjuan Lyu, Dawn Song, Chenguang Wang

With a small number of visual instructions, this emerging language instruction following ability transfers well to the unseen vision datasets, outperforming the state of the art with greater training efficiency.

Instruction Following Zero-shot Generalization

PICZL: Image-based Photometric Redshifts for AGN

no code implementations11 Nov 2024 William Roster, Mara Salvato, Sven Krippendorf, Aman Saxena, Raphael Shirley, Johannes Buchner, Julien Wolf, Tom Dwelly, Franz E. Bauer, James Aird, Claudio Ricci, Roberto J. Assef, Scott F. Anderson, Xin Liu, Andrea Merloni, Jochen Weller, Kirpal Nandra

Instead, we employ readily available data products from the 10th Data Release of the Imaging Legacy Survey for DESI, covering >20,000 deg$^{2}$ with deep images and catalog-based photometry in the grizW1-W4 bands.

Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models

1 code implementation28 Oct 2024 Yilun Jin, Zheng Li, Chenwei Zhang, Tianyu Cao, Yifan Gao, Pratik Jayarao, Mao Li, Xin Liu, Ritesh Sarkhel, Xianfeng Tang, Haodong Wang, Zhengyang Wang, Wenju Xu, Jingfeng Yang, Qingyu Yin, Xian Li, Priyanka Nigam, Yi Xu, Kai Chen, Qiang Yang, Meng Jiang, Bing Yin

Shopping MMLU consists of 57 tasks covering 4 major shopping skills: concept understanding, knowledge reasoning, user behavior alignment, and multi-linguality, and can thus comprehensively evaluate the abilities of LLMs as general shop assistants.

Few-Shot Learning MMLU

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

1 code implementation28 Oct 2024 Hanshi Sun, Li-Wen Chang, Wenlei Bao, Size Zheng, Ningxin Zheng, Xin Liu, Harry Dong, Yuejie Chi, Beidi Chen

By evaluating ShadowKV on a broad range of benchmarks, including RULER, LongBench, and Needle In A Haystack, and models like Llama-3.1-8B, Llama-3-8B-1M, GLM-4-9B-1M, Yi-9B-200K, Phi-3-Mini-128K, and Qwen2-7B-128K, we demonstrate that it can support up to 6$\times$ larger batch sizes and boost throughput by up to 3.04$\times$ on an A100 GPU without sacrificing accuracy, even surpassing the performance achievable with infinite batch size under the assumption of infinite GPU memory.

A Stock Price Prediction Approach Based on Time Series Decomposition and Multi-Scale CNN using OHLCT Images

no code implementations25 Oct 2024 Zhiyuan Pei, Jianqi Yan, Jin Yan, Bailing Yang, Ziyuan Li, Lin Zhang, Xin Liu, Yang Zhang

By utilizing CNN to learn sequential features and combining them with image features, we improve the accuracy of stock trend prediction on the A-share market stock dataset.

Stock Prediction Stock Price Prediction +2

Enhancing Safety in Reinforcement Learning with Human Feedback via Rectified Policy Optimization

1 code implementation25 Oct 2024 Xiyue Peng, Hengquan Guo, Jiawei Zhang, Dongqing Zou, Ziyu Shao, Honghao Wei, Xin Liu

To address this issue, we propose Rectified Policy Optimization (RePO), which replaces the expected safety constraint with critical safety constraints imposed on every prompt.

Safety Alignment

MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model

no code implementations22 Oct 2024 Meng Xu, Tong Zhang, Fuyun Wang, Yi Lei, Xin Liu, Zhen Cui

To our knowledge, MPDS is the first image-text pair dataset dedicated to posters, comprising 373k+ image-text pairs and 8k+ actor images (covering 4k+ actors).

4k 8k +2

SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training

no code implementations20 Oct 2024 Jinda Jia, Cong Xie, Hanlin Lu, Daoce Wang, Hao Feng, Chengming Zhang, Baixi Sun, Haibin Lin, Zhi Zhang, Xin Liu, Dingwen Tao

Recent years have witnessed a clear trend towards language models with an ever-increasing number of parameters, as well as the growing training overhead and memory usage.

Quantization

HACSurv: A Hierarchical Copula-Based Approach for Survival Analysis with Dependent Competing Risks

1 code implementation19 Oct 2024 Xin Liu, Weijia Zhang, Min-Ling Zhang

In this paper, we introduce HACSurv, a survival analysis method that learns Hierarchical Archimedean Copulas structures and cause-specific survival functions from data with competing risks.

Prognosis Survival Analysis +1

GeSubNet: Gene Interaction Inference for Disease Subtype Network Generation

no code implementations17 Oct 2024 Ziwei Yang, Zheng Chen, Xin Liu, Rikuto Kotoge, Peng Chen, Yasuko Matsubara, Yasushi Sakurai, Jimeng Sun

Retrieving gene functional networks from knowledge databases presents a challenge due to the mismatch between disease networks and subtype-specific variations.

Graph Generation Graph Neural Network +1

Interpreting Inflammation Prediction Model via Tag-based Cohort Explanation

no code implementations17 Oct 2024 Fanyu Meng, Jules Larke, Xin Liu, Zhaodan Kong, Xin Chen, Danielle Lemay, Ilias Tagkopoulos

Machine learning is revolutionizing nutrition science by enabling systems to learn from data and make intelligent decisions.

Decision Making Feature Importance +2

CohEx: A Generalized Framework for Cohort Explanation

1 code implementation17 Oct 2024 Fanyu Meng, Xin Liu, Zhaodan Kong, Xin Chen

eXplainable Artificial Intelligence (XAI) has garnered significant attention for enhancing transparency and trust in machine learning models.

Explainable artificial intelligence Explainable Artificial Intelligence (XAI)

Evidence of Cognitive Deficits and Developmental Advances in Generative AI: A Clock Drawing Test Analysis

no code implementations15 Oct 2024 Isaac R. Galatzer-Levy, Jed McGiffin, David Munday, Xin Liu, Danny Karmon, Ilia Labzovsky, Rivka Moroshko, Amir Zait, Daniel McDuff

Generative AI's rapid advancement sparks interest in its cognitive abilities, especially given its capacity for tasks like language understanding and code generation.

Code Generation

MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router

no code implementations15 Oct 2024 Yanyue Xie, Zhi Zhang, Ding Zhou, Cong Xie, Ziang Song, Xin Liu, Yanzhi Wang, Xue Lin, An Xu

Experimental results demonstrate that the Mixtral-8x7B model with 50% sparsity maintains 99% of the performance of the original model after the expert-wise knowledge distillation.

Knowledge Distillation Language Modeling +3

Improving Arabic Multi-Label Emotion Classification using Stacked Embeddings and Hybrid Loss Function

1 code implementation4 Oct 2024 Muhammad Azeem Aslam, Wang Jun, Nisar Ahmed, Muhammad Imran Zaman, Li Yanan, Hu Hongfei, Wang Shiyu, Xin Liu

In multi-label emotion classification, particularly for low-resource languages like Arabic, the challenges of class imbalance and label correlation hinder model performance, especially in accurately predicting minority emotions.

Classification Contrastive Learning +4

Constrained Reasoning Chains for Enhancing Theory-of-Mind in Large Language Models

1 code implementation20 Sep 2024 Zizheng Lin, Chunkit Chan, Yangqiu Song, Xin Liu

Afterward, CCoToM prompts LLMs to infer the queried ToM dimension based on the generated related ToM dimensions and corresponding causal relations.

Infrared and Visible Image Fusion with Hierarchical Human Perception

no code implementations14 Sep 2024 Guang Yang, Jie Li, Xin Liu, Zhusi Zhong, Xinbo Gao

Existing methods take pixel intensity, texture and high-level vision task information as the standards to determine preservation of information, lacking enhancement for human perception.

Infrared And Visible Image Fusion Language Modeling +1

DiffFAS: Face Anti-Spoofing via Generative Diffusion Models

1 code implementation13 Sep 2024 Xinxu Ge, Xin Liu, Zitong Yu, Jingang Shi, Chun Qi, Jie Li, Heikki Kälviäinen

Based on our analysis, we propose DiffFAS framework, which quantifies quality as prior information input into the network to counter image quality shift, and performs diffusion-based high-fidelity cross-domain and cross-attack types generation to counter image style shift.

Face Anti-Spoofing Face Recognition

A Double Tracking Method for Optimization with Decentralized Generalized Orthogonality Constraints

no code implementations8 Sep 2024 Lei Wang, Nachuan Xiao, Xin Liu

In this paper, we consider the decentralized optimization problems with generalized orthogonality constraints, where both the objective function and the constraint exhibit a distributed structure.

Explanation Space: A New Perspective into Time Series Interpretability

1 code implementation2 Sep 2024 Shahbaz Rezaei, Xin Liu

Human understandable explanation of deep learning models is necessary for many critical and sensitive applications.

Time Series

Benchmarking Counterfactual Interpretability in Deep Learning Models for Time Series Classification

no code implementations22 Aug 2024 Ziwen Kan, Shahbaz Rezaei, Xin Liu

We specifically redesign the metrics for sparsity and plausibility and introduce a new metric for consistency.

Benchmarking counterfactual +2

EMO-LLaMA: Enhancing Facial Emotion Understanding with Instruction Tuning

1 code implementation21 Aug 2024 Bohao Xing, Zitong Yu, Xin Liu, Kaishen Yuan, Qilang Ye, Weicheng Xie, Huanjing Yue, Jingyu Yang, Heikki Kälviäinen

However, current FER paradigms face challenges in generalization, lack semantic information aligned with natural language, and struggle to process both images and videos within a unified framework, making their application in multimodal emotion understanding and human-computer interaction difficult.

Facial Expression Recognition Facial Expression Recognition (FER)

Attention Is Not What You Need: Revisiting Multi-Instance Learning for Whole Slide Image Classification

no code implementations18 Aug 2024 Xin Liu, Weijia Zhang, Min-Ling Zhang

From the standard MIL assumptions, we propose a surprisingly simple yet effective instance-based MIL method for WSI classification (FocusMIL) based on max-pooling and forward amortized variational inference.

Classification image-classification +2

MU-MAE: Multimodal Masked Autoencoders-Based One-Shot Learning

no code implementations8 Aug 2024 Rex Liu, Xin Liu

Mu-MAE integrates a multimodal masked autoencoder with a synchronized masking strategy tailored for wearable sensors.

Human Activity Recognition One-Shot Learning

Diffusion Model Meets Non-Exemplar Class-Incremental Learning and Beyond

no code implementations6 Aug 2024 Jichuan Zhang, YaLi Li, Xin Liu, Shengjin Wang

Non-exemplar class-incremental learning (NECIL) is to resist catastrophic forgetting without saving old class samples.

class-incremental learning Class Incremental Learning +4

From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation

1 code implementation5 Aug 2024 Xin Liu, Chao Hao, Zitong Yu, Huanjing Yue, Jingyu Yang

ARR decomposes the action anticipation task into action recognition and sequence reasoning tasks, and effectively learns the statistical relationship between actions by next action prediction (NAP).

Action Anticipation Action Recognition +3

ByteCheckpoint: A Unified Checkpointing System for Large Foundation Model Development

no code implementations29 Jul 2024 Borui Wan, Mingji Han, Yiyao Sheng, Yanghua Peng, Haibin Lin, Mofan Zhang, Zhichao Lai, Menghan Yu, Junda Zhang, Zuquan Song, Xin Liu, Chuan Wu

In production, different LFMs are trained with various frameworks and storage backends, depending on model sizes and training scales.

Adversarial Robustness in RGB-Skeleton Action Recognition: Leveraging Attention Modality Reweighter

no code implementations29 Jul 2024 Chao Liu, Xin Liu, Zitong Yu, Yonghong Hou, Huanjing Yue, Jingyu Yang

We initially conducted empirical analysis on the robustness of different modalities and observed that the skeleton modality is more robust than the RGB modality.

Action Recognition Adversarial Robustness +1

UGNCL: Uncertainty-Guided Noisy Correspondence Learning for Efficient Cross-Modal Matching

1 code implementation SIGIR 2024 Quanxing Zha, Xin Liu, Yiu-ming Cheung, Xing Xu, Nannan Wang, Jianjia Cao

Cross-modal matching has recently gained significant popularity to facilitate retrieval across multi-modal data, and existing works rely heavily on an implicit assumption that the training data pairs are perfectly aligned.

Cross-modal retrieval with noisy correspondence Image-text matching +1

Tuning Vision-Language Models with Candidate Labels by Prompt Alignment

no code implementations10 Jul 2024 Zhifang Zhang, Yuwei Niu, Xin Liu, Beibei Li

In order to improve its robustness, we propose a simple yet effective framework that better leverages the prior knowledge of VLMs to guide the learning process with candidate labels.

Prompt Learning

Zero-Shot Video Restoration and Enhancement Using Pre-Trained Image Diffusion Model

no code implementations2 Jul 2024 Cong Cao, Huanjing Yue, Xin Liu, Jingyu Yang

In this paper, we propose the first framework for zero-shot video restoration and enhancement based on the pre-trained image diffusion model.

Image Restoration Video Restoration

MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?

no code implementations28 Jun 2024 Jinming Li, Yichen Zhu, Zhiyuan Xu, Jindong Gu, Minjie Zhu, Xin Liu, Ning Liu, Yaxin Peng, Feifei Feng, Jian Tang

It is fundamentally challenging for robots to serve as useful assistants in human environments because this requires addressing a spectrum of sub-problems across robotics, including perception, language understanding, reasoning, and planning.

Task Planning Visual Reasoning

Enhancing Language Model Factuality via Activation-Based Confidence Calibration and Guided Decoding

1 code implementation19 Jun 2024 Xin Liu, Farima Fatahi Bayat, Lu Wang

Built on top of ActCab, we further propose CoDec, a confidence-guided decoding strategy to elicit truthful answers with high confidence from LMs.

Language Modeling Language Modelling +1

What Are the Odds? Language Models Are Capable of Probabilistic Reasoning

1 code implementation18 Jun 2024 Akshay Paruchuri, Jake Garrison, Shun Liao, John Hernandez, Jacob Sunshine, Tim Althoff, Xin Liu, Daniel McDuff

Language models (LM) are capable of remarkably complex linguistic tasks; however, numerical reasoning is an area in which they frequently struggle.

WeatherQA: Can Multimodal Language Models Reason about Severe Weather?

1 code implementation17 Jun 2024 Chengqian Ma, Zhanxiang Hua, Alexandra Anderson-Frey, Vikram Iyer, Xin Liu, Lianhui Qin

In this work, we introduce WeatherQA, the first multimodal dataset designed for machines to reason about complex combinations of weather parameters (a.k.a., ingredients) and predict severe weather in real-world scenarios.

Data Integration

Balancing Embedding Spectrum for Recommendation

1 code implementation17 Jun 2024 Shaowen Peng, Kazunari Sugiyama, Xin Liu, Tsunenori Mine

In this work, we shed light on an issue in the existing pair-wise learning paradigm (i.e., the embedding collapse problem), that the representations tend to span a subspace of the whole embedding space, leading to a suboptimal solution and reducing the model capacity.

Contrastive Learning Recommendation Systems

How Powerful is Graph Filtering for Recommendation

1 code implementation13 Jun 2024 Shaowen Peng, Xin Liu, Kazunari Sugiyama, Tsunenori Mine

Based on this observation, we propose a generalized graph normalization G$^{2}$N to adjust the sharpness of spectral distribution in order to redistribute data noise to assure that it can be removed by graph filtering without training.

Collaborative Filtering

FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion

1 code implementation11 Jun 2024 Li-Wen Chang, Wenlei Bao, Qi Hou, Chengquan Jiang, Ningxin Zheng, Yinmin Zhong, Xuanrun Zhang, Zuquan Song, Chengji Yao, Ziheng Jiang, Haibin Lin, Xin Jin, Xin Liu

Overall, it can achieve up to 1.24x speedups for training over Megatron-LM on a cluster of 128 GPUs with various GPU generations and interconnects, and up to 1.66x and 1.30x speedups for prefill and decoding inference over vLLM on a cluster with 8 GPUs with various GPU generations and interconnects.

Transforming Wearable Data into Health Insights using Large Language Model Agents

no code implementations10 Jun 2024 Mike A. Merrill, Akshay Paruchuri, Naghmeh Rezaei, Geza Kovacs, Javier Perez, Yun Liu, Erik Schenck, Nova Hammerquist, Jake Sunshine, Shyam Tailor, Kumar Ayush, Hao-Wei Su, Qian He, Cory Y. McLean, Mark Malhotra, Shwetak Patel, Jiening Zhan, Tim Althoff, Daniel McDuff, Xin Liu

Despite the proliferation of wearable health trackers and the importance of sleep and exercise to health, deriving actionable personalized insights from wearable data remains a challenge because doing so requires non-trivial open-ended analysis of these data.

Code Generation Information Retrieval +4

Learning Future Representation with Synthetic Observations for Sample-efficient Reinforcement Learning

no code implementations20 May 2024 Xin Liu, Yaran Chen, Dongbin Zhao

Employing auxiliary tasks allows the agent to enhance visual representation in a targeted manner, thereby improving the sample efficiency and performance of downstream RL.

continuous-control Continuous Control +2

A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine

no code implementations14 May 2024 Hanguang Xiao, Feizhong Zhou, Xingyue Liu, Tianqi Liu, Zhipeng Li, Xin Liu, Xiaoxuan Huang

Finally, the survey addresses the challenges confronting medical LLMs and MLLMs and proposes practical strategies and future directions for their integration into medicine.

Survey

A Decoupling and Aggregating Framework for Joint Extraction of Entities and Relations

no code implementations14 May 2024 Yao Wang, Xin Liu, Weikun Kong, Hai-Tao Yu, Teeradaj Racharak, Kyoung-Sook Kim, Minh Le Nguyen

Second, information interaction mainly focuses on the two subtasks, leaving the fine-grained information interaction among the subtask-specific features of encoding subjects, relations, and objects unexplored.

named-entity-recognition Named Entity Recognition +1

Towards Subgraph Isomorphism Counting with Graph Kernels

no code implementations13 May 2024 Xin Liu, Weiqi Wang, Jiaxin Bai, Yangqiu Song

Subgraph isomorphism counting is known to be #P-complete and requires exponential time to find the accurate solution.

Graph Classification Representation Learning

Disttack: Graph Adversarial Attacks Toward Distributed GNN Training

1 code implementation10 May 2024 Yuxiang Zhang, Xin Liu, Meng Wu, Wei Yan, Mingyu Yan, Xiaochun Ye, Dongrui Fan

In this study, we introduce Disttack, the first framework of adversarial attacks for distributed GNN training that leverages the characteristics of frequent gradient updates in a distributed system.

Adversarial Attack Graph Learning

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

5 code implementations7 May 2024 DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Li, Hui Qu, J. L. Cai, Jian Liang, JianZhong Guo, Jiaqi Ni, Jiashi Li, Jin Chen, Jingyang Yuan, Junjie Qiu, Junxiao Song, Kai Dong, Kaige Gao, Kang Guan, Lean Wang, Lecong Zhang, Lei Xu, Leyi Xia, Liang Zhao, Liyue Zhang, Meng Li, Miaojun Wang, Mingchuan Zhang, Minghua Zhang, Minghui Tang, Mingming Li, Ning Tian, Panpan Huang, Peiyi Wang, Peng Zhang, Qihao Zhu, Qinyu Chen, Qiushi Du, R. J. Chen, R. L. Jin, Ruiqi Ge, Ruizhe Pan, Runxin Xu, Ruyi Chen, S. S. Li, Shanghao Lu, Shangyan Zhou, Shanhuang Chen, Shaoqing Wu, Shengfeng Ye, Shirong Ma, Shiyu Wang, Shuang Zhou, Shuiping Yu, Shunfeng Zhou, Size Zheng, T. Wang, Tian Pei, Tian Yuan, Tianyu Sun, W. L. Xiao, Wangding Zeng, Wei An, Wen Liu, Wenfeng Liang, Wenjun Gao, Wentao Zhang, X. Q. Li, Xiangyue Jin, Xianzu Wang, Xiao Bi, Xiaodong Liu, Xiaohan Wang, Xiaojin Shen, Xiaokang Chen, Xiaosha Chen, Xiaotao Nie, Xiaowen Sun, Xiaoxiang Wang, Xin Liu, Xin Xie, Xingkai Yu, Xinnan Song, Xinyi Zhou, Xinyu Yang, Xuan Lu, Xuecheng Su, Y. Wu, Y. K. Li, Y. X. Wei, Y. X. Zhu, Yanhong Xu, Yanping Huang, Yao Li, Yao Zhao, Yaofeng Sun, Yaohui Li, Yaohui Wang, Yi Zheng, Yichao Zhang, Yiliang Xiong, Yilong Zhao, Ying He, Ying Tang, Yishi Piao, Yixin Dong, Yixuan Tan, Yiyuan Liu, Yongji Wang, Yongqiang Guo, Yuchen Zhu, Yuduan Wang, Yuheng Zou, Yukun Zha, Yunxian Ma, Yuting Yan, Yuxiang You, Yuxuan Liu, Z. Z. Ren, Zehui Ren, Zhangli Sha, Zhe Fu, Zhen Huang, Zhen Zhang, Zhenda Xie, Zhewen Hao, Zhihong Shao, Zhiniu Wen, Zhipeng Xu, Zhongyu Zhang, Zhuoshu Li, Zihan Wang, Zihui Gu, Zilin Li, Ziwei Xie

MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation.

Language Modeling Language Modelling +2

A quantile-based nonadditive fixed effects model

no code implementations6 May 2024 Xin Liu

I propose a quantile-based nonadditive fixed effects panel model to study heterogeneous causal effects.

model quantile regression

Enhancing Micro Gesture Recognition for Emotion Understanding via Context-aware Visual-Text Contrastive Learning

1 code implementation3 May 2024 Deng Li, Bohao Xing, Xin Liu

In addition, instead of using handcrafted prompts for visual-text contrastive learning, we propose a novel module called Adaptive prompting to generate context-aware prompts.

Contrastive Learning Gesture Recognition +1

Enhanced Language Model Truthfulness with Learnable Intervention and Uncertainty Expression

1 code implementation1 May 2024 Farima Fatahi Bayat, Xin Liu, H. V. Jagadish, Lu Wang

The adaptive nature of LITO counters the limitations of one-size-fits-all intervention methods, maximizing truthfulness by reflecting the model's internal knowledge only when it is confident.

Language Modeling Language Modelling +1

EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model

no code implementations1 May 2024 Deng Li, Xin Liu, Bohao Xing, Baiqiang Xia, Yuan Zong, Bihan Wen, Heikki Kälviäinen

In contrast, long sequential videos can reveal authentic emotions; 2) Previous studies commonly utilize various signals such as facial, speech, and even sensitive biological signals (e.g., electrocardiogram).

De-identification Emotion Recognition +3

NegotiationToM: A Benchmark for Stress-testing Machine Theory of Mind on Negotiation Surrounding

1 code implementation21 Apr 2024 Chunkit Chan, Cheng Jiayang, Yauwai Yim, Zheye Deng, Wei Fan, Haoran Li, Xin Liu, Hongming Zhang, Weiqi Wang, Yangqiu Song

Large Language Models (LLMs) have sparked substantial interest and debate concerning their potential emergence of Theory of Mind (ToM) ability.

RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation

no code implementations18 Apr 2024 Chao Jin, Zili Zhang, Xuanlin Jiang, Fangyue Liu, Xin Liu, Xuanzhe Liu, Xin Jin

We implement RAGCache and evaluate it on vLLM, a state-of-the-art LLM inference system, and Faiss, a state-of-the-art vector database.

RAG Retrieval +1

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

3 code implementations16 Apr 2024 Bin Ren, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, Wei Zhai, Renjing Pei, Jiaming Guo, Songcen Xu, Yang Cao, ZhengJun Zha, Yan Wang, Yi Liu, Qing Wang, Gang Zhang, Liou Zhang, Shijie Zhao, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Xin Liu, Min Yan, Menghan Zhou, Yiqiang Yan, Yixuan Liu, Wensong Chan, Dehua Tang, Dong Zhou, Li Wang, Lu Tian, Barsoum Emad, Bohan Jia, Junbo Qiao, Yunshuai Zhou, Yun Zhang, Wei Li, Shaohui Lin, Shenglong Zhou, Binbin Chen, Jincheng Liao, Suiyi Zhao, Zhao Zhang, Bo wang, Yan Luo, Yanyan Wei, Feng Li, Mingshen Wang, Yawei Li, Jinhan Guan, Dehua Hu, Jiawei Yu, Qisheng Xu, Tao Sun, Long Lan, Kele Xu, Xin Lin, Jingtong Yue, Lehan Yang, Shiyi Du, Lu Qi, Chao Ren, Zeyu Han, YuHan Wang, Chaolin Chen, Haobo Li, Mingjun Zheng, Zhongbao Yang, Lianhong Song, Xingzhuo Yan, Minghan Fu, Jingyi Zhang, Baiang Li, Qi Zhu, Xiaogang Xu, Dan Guo, Chunle Guo, Jiadi Chen, Huanhuan Long, Chunjiang Duanmu, Xiaoyan Lei, Jie Liu, Weilin Jia, Weifeng Cao, Wenlong Zhang, Yanyu Mao, Ruilong Guo, Nihao Zhang, Qian Wang, Manoj Pandey, Maksym Chernozhukov, Giang Le, Shuli Cheng, Hongyuan Wang, Ziyan Wei, Qingting Tang, Liejun Wang, Yongming Li, Yanhui Guo, Hao Xu, Akram Khatami-Rizi, Ahmad Mahmoudi-Aznaveh, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi

In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking.

Image Super-Resolution

DEGNN: Dual Experts Graph Neural Network Handling Both Edge and Node Feature Noise

1 code implementation14 Apr 2024 Tai Hasegawa, Sukwon Yun, Xin Liu, Yin Jun Phua, Tsuyoshi Murata

Leveraging these modified representations, DEGNN subsequently addresses downstream tasks, ensuring robustness against noise present in both edges and node features of real-world graphs.

Graph Neural Network Graph structure learning +1

Future-Proofing Class Incremental Learning

no code implementations4 Apr 2024 Quentin Jodelet, Xin Liu, Yin Jun Phua, Tsuyoshi Murata

Exemplar-Free Class Incremental Learning is a highly challenging setting where replay memory is unavailable.

class-incremental learning Class Incremental Learning +2

EventGround: Narrative Reasoning by Grounding to Eventuality-centric Knowledge Graphs

1 code implementation30 Mar 2024 Cheng Jiayang, Lin Qiu, Chunkit Chan, Xin Liu, Yangqiu Song, Zheng Zhang

In this work, we propose an initial comprehensive framework called EventGround, which aims to tackle the problem of grounding free-texts to eventuality-centric KGs for contextualized narrative reasoning.

Graph Neural Network Knowledge Graphs +3

Convergence of Decentralized Stochastic Subgradient-based Methods for Nonsmooth Nonconvex functions

no code implementations18 Mar 2024 Siyuan Zhang, Nachuan Xiao, Xin Liu

In this paper, we focus on the decentralized stochastic subgradient-based methods in minimizing nonsmooth nonconvex functions without Clarke regularity, especially in the decentralized training of nonsmooth neural networks.

Generation is better than Modification: Combating High Class Homophily Variance in Graph Anomaly Detection

no code implementations15 Mar 2024 Rui Zhang, Dawei Cheng, Xin Liu, Jie Yang, Yi Ouyang, Xian Wu, Yefeng Zheng

We find that in graph anomaly detection, the homophily distribution differences between different classes are significantly greater than those in homophilic and heterophilic graphs.

Graph Anomaly Detection Graph Classification +2

Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study

1 code implementation15 Mar 2024 Chenguang Wang, Ruoxi Jia, Xin Liu, Dawn Song

We show that CLIP leads to a significant robustness drop compared to supervised ImageNet models on our benchmark, especially under synthetic distribution shift and adversarial attacks.

Benchmarking

Answering Diverse Questions via Text Attached with Key Audio-Visual Clues

1 code implementation11 Mar 2024 Qilang Ye, Zitong Yu, Xin Liu

Audio-visual question answering (AVQA) requires reference to video content and auditory information, followed by correlating the question to predict the most precise answer.

Audio-visual Question Answering Audio-Visual Question Answering (AVQA) +3

Advancing Generalizable Remote Physiological Measurement through the Integration of Explicit and Implicit Prior Knowledge

1 code implementation11 Mar 2024 Yuting Zhang, Hao Lu, Xin Liu, Yingcong Chen, Kaishun Wu

Remote photoplethysmography (rPPG) is a promising technology that captures physiological signals from face videos, with potential applications in medical health, emotional computing, and biosecurity recognition.

Domain Generalization

ACT-MNMT Auto-Constriction Turning for Multilingual Neural Machine Translation

no code implementations11 Mar 2024 Shaojie Dai, Xin Liu, Ping Luo, Yue Yu

Large language models (LLMs) have achieved promising performance in multilingual machine translation tasks through zero/few-shot prompts or prompt-tuning.

Language Modelling Large Language Model +2

Revisiting Edge Perturbation for Graph Neural Network in Graph Data Augmentation and Attack

no code implementations10 Mar 2024 Xin Liu, Yuxiang Zhang, Meng Wu, Mingyu Yan, Kun He, Wei Yan, Shirui Pan, Xiaochun Ye, Dongrui Fan

It can be categorized into two veins based on their effects on the performance of graph neural networks (GNNs), i.e., graph data augmentation and attack.

Data Augmentation Graph Neural Network

AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors

1 code implementation7 Mar 2024 Kaishen Yuan, Zitong Yu, Xin Liu, Weicheng Xie, Huanjing Yue, Jingyu Yang

Facial Action Units (AUs) are a vital concept in the realm of affective computing, and AU detection has always been a hot research topic.

Facial Action Unit Detection Transfer Learning

Multi-modal Attribute Prompting for Vision-Language Models

no code implementations1 Mar 2024 Xin Liu, Jiamin Wu, Wenfei Yang, Xu Zhou, Tianzhu Zhang

To address this issue, we propose a Multi-modal Attribute Prompting method (MAP) by jointly exploring textual attribute prompting, visual attribute prompting, and attribute-level alignment.

Attribute cross-modal alignment

A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection

1 code implementation29 Feb 2024 Chao Hao, Zitong Yu, Xin Liu, Jun Xu, Huanjing Yue, Jingyu Yang

Camouflaged object detection (COD) and salient object detection (SOD) are two distinct yet closely-related computer vision tasks widely studied during the past decades.

Object object-detection +2

FedUV: Uniformity and Variance for Heterogeneous Federated Learning

no code implementations CVPR 2024 Ha Min Son, Moon-Hyun Kim, Tai-Myoung Chung, Chao Huang, Xin Liu

Based on this finding, we introduce two regularization terms for local training to continuously emulate IID settings: (1) variance in the dimension-wise probability distribution of the classifier and (2) hyperspherical uniformity of representations of the encoder.

Federated Learning

Convergence Analysis of Split Federated Learning on Heterogeneous Data

no code implementations23 Feb 2024 Pengchao Han, Chao Huang, Geng Tian, Ming Tang, Xin Liu

We further extend the analysis to non-convex objectives and the scenario where some clients may be unavailable during training.

Federated Learning

Safety of Multimodal Large Language Models on Images and Texts

2 code implementations1 Feb 2024 Xin Liu, Yichen Zhu, Yunshi Lan, Chao Yang, Yu Qiao

In this paper, we systematically survey current efforts on the evaluation, attack, and defense of MLLMs' safety on images and text.

Survey

Fast Adversarial Training against Textual Adversarial Attacks

no code implementations23 Jan 2024 Yichen Yang, Xin Liu, Kun He

Based on the observation that the adversarial perturbations crafted by single-step and multi-step gradient ascent are similar, FAT uses single-step gradient ascent to craft adversarial examples in the embedding space to expedite the training process.

Adversarial Defense Adversarial Robustness

CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning

2 code implementations14 Jan 2024 Weiqi Wang, Tianqing Fang, Chunyang Li, Haochen Shi, Wenxuan Ding, Baixuan Xu, Zhaowei Wang, Jiaxin Bai, Xin Liu, Jiayang Cheng, Chunkit Chan, Yangqiu Song

The sequential process of conceptualization and instantiation is essential to generalizable commonsense reasoning as it allows the application of existing knowledge to unfamiliar scenarios.

Diversity

Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning

no code implementations1 Jan 2024 Honghao Wei, Xiyue Peng, Arnob Ghosh, Xin Liu

In theory, we demonstrate that when the actor employs a no-regret optimization oracle, WSAC achieves a number of guarantees: (i) For the first time in the safe offline RL setting, we establish that WSAC can produce a policy that outperforms any reference policy while maintaining the same level of safety, which is critical to designing a safe algorithm for offline RL.

continuous-control Continuous Control +2

Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical Hypothesis Generation

1 code implementation25 Dec 2023 Jiaxin Bai, Yicheng Wang, Tianshi Zheng, Yue Guo, Xin Liu, Yangqiu Song

Although many applications require the use of knowledge for explanations, the utilization of abductive reasoning in conjunction with structured knowledge, such as a knowledge graph, remains largely unexplored.

Knowledge Graphs Logical Reasoning

Enhancing User Intent Capture in Session-Based Recommendation with Attribute Patterns

1 code implementation NeurIPS 2023 Xin Liu, Zheng Li, Yifan Gao, Jingfeng Yang, Tianyu Cao, Zhengyang Wang, Bing Yin, Yangqiu Song

The goal of session-based recommendation in E-commerce is to predict the next item that an anonymous user will purchase based on the browsing and purchase history.

Attribute Session-Based Recommendations

Safe Reinforcement Learning with Instantaneous Constraints: The Role of Aggressive Exploration

no code implementations22 Dec 2023 Honghao Wei, Xin Liu, Lei Ying

This paper studies safe Reinforcement Learning (safe RL) with linear function approximation and under hard instantaneous constraints where unsafe actions must be avoided at each step.

4k reinforcement-learning +1

AutoAugment Input Transformation for Highly Transferable Targeted Attacks

no code implementations21 Dec 2023 Haobo Lu, Xin Liu, Kun He

However, few of them are dedicated to input transformation. In this work, we observe a positive correlation between the logit/probability of the target class and diverse input transformation methods in targeted attacks.

Adversarial Attack

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models

2 code implementations29 Nov 2023 Xin Liu, Yichen Zhu, Jindong Gu, Yunshi Lan, Chao Yang, Yu Qiao

The security concerns surrounding Large Language Models (LLMs) have been extensively explored, yet the safety of Multimodal Large Language Models (MLLMs) remains understudied.

ALPHA: AnomaLous Physiological Health Assessment Using Large Language Models

1 code implementation21 Nov 2023 Jiankai Tang, Kegang Wang, Hongming Hu, Xiyuxing Zhang, Peiyu Wang, Xin Liu, Yuntao Wang

Our findings reveal that LLMs exhibit exceptional performance in determining medical indicators, including a Mean Absolute Error (MAE) of less than 1 beat per minute for heart rate and less than 1% for oxygen saturation (SpO2).

Heart rate estimation Specificity

From Classification to Clinical Insights: Towards Analyzing and Reasoning About Mobile and Behavioral Health Data With Large Language Models

1 code implementation21 Nov 2023 Zachary Englhardt, Chengqian Ma, Margaret E. Morris, Xuhai "Orson" Xu, Chun-Cheng Chang, Lianhui Qin, Daniel McDuff, Xin Liu, Shwetak Patel, Vikram Iyer

Passively collected behavioral health data from ubiquitous sensors holds significant promise to provide mental health professionals insights from patients' daily lives; however, developing analysis tools to use this data in clinical practice requires addressing challenges of generalization across devices and weak or ambiguous correlations between the measured signals and an individual's mental health.

Decision Making

AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph

1 code implementation15 Nov 2023 Zhaowei Wang, Haochen Shi, Weiqi Wang, Tianqing Fang, Hongming Zhang, Sehyun Choi, Xin Liu, Yangqiu Song

Cognitive research indicates that abstraction ability is essential in human intelligence, which remains under-explored in language models.

Benchmarking

4K-Resolution Photo Exposure Correction at 125 FPS with ~8K Parameters

1 code implementation15 Nov 2023 Yijie Zhou, Chao Li, Jin Liang, Tianyi Xu, Xin Liu, Jun Xu

The illumination of improperly exposed photographs has been widely corrected using deep convolutional neural networks or Transformers.

4k 8k +1

Training Robust Deep Physiological Measurement Models with Synthetic Video-based Data

no code implementations9 Nov 2023 Yuxuan Ou, Yuzhe Zhang, Yuntang Wang, Shwetak Patel, Daniel McDuff, Yuzhe Yang, Xin Liu

However, there exists a significant gap between synthetic and real-world data, which hinders the generalization of neural models trained on these synthetic datasets.

Diversity

PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models

1 code implementation7 Nov 2023 Haoran Li, Dadi Guo, Donghao Li, Wei Fan, Qi Hu, Xin Liu, Chunkit Chan, Duanyi Yao, Yuan YAO, Yangqiu Song

Lastly, PrivLM-Bench performs existing privacy attacks on LMs with pre-defined privacy objectives as the empirical evaluation results.

Privacy Preserving

IBADR: an Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU models

no code implementations1 Nov 2023 Xiaoyue Wang, Xin Liu, Lijie Wang, Yaoxiang Wang, Jinsong Su, Hua Wu

Then, we pair each sample with a bias indicator representing its bias degree, and use these extended samples to train a sample generator.

Natural Language Understanding

Recaptured Raw Screen Image and Video Demoiréing via Channel and Spatial Modulations

1 code implementation NeurIPS 2023 Huanjing Yue, Yijia Cheng, Xin Liu, Jingyu Yang

The spatial modulation utilizes the feature with large receptive field to modulate the feature with small receptive field.

LitCab: Lightweight Language Model Calibration over Short- and Long-form Responses

1 code implementation30 Oct 2023 Xin Liu, Muhammad Khalifa, Lu Wang

For evaluation, we construct CaT, a benchmark consisting of eight text generation tasks, covering responses ranging from short phrases to paragraphs.

Form Language Modeling +2

Gold: A Global and Local-aware Denoising Framework for Commonsense Knowledge Graph Noise Detection

1 code implementation18 Oct 2023 Zheye Deng, Weiqi Wang, Zhaowei Wang, Xin Liu, Yangqiu Song

Commonsense Knowledge Graphs (CSKGs) are crucial for commonsense reasoning, yet constructing them through human annotations can be costly.

Denoising Knowledge Graphs +1

On the Convergence of Federated Averaging under Partial Participation for Over-parameterized Neural Networks

no code implementations9 Oct 2023 Xin Liu, Wei Li, Dazhi Zhan, Yu Pan, Xin Ma, Yu Ding, Zhisong Pan

Federated learning (FL) is a widely employed distributed paradigm for collaboratively training machine learning models from multiple clients without sharing local data.

Federated Learning

ComSD: Balancing Behavioral Quality and Diversity in Unsupervised Skill Discovery

1 code implementation29 Sep 2023 Xin Liu, Yaran Chen, Dongbin Zhao

It contains a particle-based exploration reward to make agents access far-reaching states for exploratory skill acquisition, and a novel contrastive diversity reward to promote the discriminability between different skills.

Contrastive Learning Diversity +2

Self-Consistent Narrative Prompts on Abductive Natural Language Inference

1 code implementation15 Sep 2023 Chunkit Chan, Xin Liu, Tsz Ho Chan, Jiayang Cheng, Yangqiu Song, Ginny Wong, Simon See

However, the inter-sentential coherence and the model consistency have not been well exploited in the previous works on this task.

Language Modeling Language Modelling +1

Federated Linear Bandit Learning via Over-the-Air Computation

no code implementations25 Aug 2023 Jiali Wang, Yuning Jiang, Xin Liu, Ting Wang, Yuanming Shi

In this context, we propose a customized federated linear bandits scheme, where each device transmits an analog signal, and the server receives a superposition of these signals distorted by channel noise.

Video BagNet: short temporal receptive fields increase robustness in long-term action recognition

1 code implementation22 Aug 2023 Ombretta Strafforello, Xin Liu, Klamer Schutte, Jan van Gemert

Previous work on long-term video action recognition relies on deep 3D-convolutional models that have a large temporal receptive field (RF).

Action Recognition Temporal Action Localization

Federated Reinforcement Learning for Electric Vehicles Charging Control on Distribution Networks

no code implementations17 Aug 2023 Junkai Qian, Yuning Jiang, Xin Liu, Qing Wang, Ting Wang, Yuanming Shi, Wei Chen

To effectively learn the optimal EV charging control strategy, a federated deep reinforcement learning algorithm named FedSAC is further proposed.

Deep Reinforcement Learning reinforcement-learning
