Search Results for author: Rui Wang

Found 586 papers, 213 papers with code

Syntax in End-to-End Natural Language Processing

no code implementations EMNLP (ACL) 2021 Hai Zhao, Rui Wang, Kehai Chen

This tutorial surveys the latest technical progress of syntactic parsing and the role of syntax in end-to-end natural language processing (NLP) tasks, in which semantic role labeling (SRL) and machine translation (MT) are the representative NLP tasks that have always been beneficial from informative syntactic clues since a long time ago, though the advance from end-to-end deep learning models shows new results.

Machine Translation NMT +2

Unsupervised Paraphrasing Consistency Training for Low Resource Named Entity Recognition

no code implementations EMNLP 2021 Rui Wang, Ricardo Henao

Unsupervised consistency training is a way of semi-supervised learning that encourages consistency in model predictions between the original and augmented data.

Data Augmentation Low Resource Named Entity Recognition +5

Synchronous Refinement for Neural Machine Translation

no code implementations Findings (ACL) 2022 Kehai Chen, Masao Utiyama, Eiichiro Sumita, Rui Wang, Min Zhang

Machine translation typically adopts an encoder-to-decoder framework, in which the decoder generates the target sentence word-by-word in an auto-regressive manner.

Decoder Machine Translation +2

Few-Shot Class-Incremental Learning for Named Entity Recognition

no code implementations ACL 2022 Rui Wang, Tong Yu, Handong Zhao, Sungchul Kim, Subrata Mitra, Ruiyi Zhang, Ricardo Henao

In this work, we study a more challenging but practical problem, i. e., few-shot class-incremental learning for NER, where an NER model is trained with only few labeled samples of the new classes, without forgetting knowledge of the old ones.

class-incremental learning Few-Shot Class-Incremental Learning +4

Stacked AMR Parsing with Silver Data

1 code implementation Findings (EMNLP) 2021 Qingrong Xia, Zhenghua Li, Rui Wang, Min Zhang

In particular, one recent seq-to-seq work directly fine-tunes AMR graph sequences on the encoder-decoder pre-trained language model and achieves new state-of-the-art results, outperforming previous works by a large margin.

Abstract Meaning Representation AMR Parsing +3

Chinese Grammatical Error Diagnosis with Graph Convolution Network and Multi-task Learning

no code implementations AACL (NLP-TEA) 2020 Yikang Luo, Zuyi Bao, Chen Li, Rui Wang

For the correction subtask, we utilize the masked language model, the seq2seq model and the spelling check model to generate corrections based on the detection results.

Language Modeling Language Modelling +2

Estimating Q(s,s') with Deterministic Dynamics Gradients

no code implementations ICML 2020 Ashley Edwards, Himanshu Sahni, Rosanne Liu, Jane Hung, Ankit Jain, Rui Wang, Adrien Ecoffet, Thomas Miconi, Charles Isbell, Jason Yosinski

In this paper, we introduce a novel form of a value function, $Q(s, s')$, that expresses the utility of transitioning from a state $s$ to a neighboring state $s'$ and then acting optimally thereafter.

Transfer Learning

Seed1.5-VL Technical Report

no code implementations11 May 2025 Dong Guo, Faming Wu, Feida Zhu, Fuxing Leng, Guang Shi, Haobin Chen, Haoqi Fan, Jian Wang, Jianyu Jiang, Jiawei Wang, Jingji Chen, Jingjia Huang, Kang Lei, Liping Yuan, Lishu Luo, PengFei Liu, Qinghao Ye, Rui Qian, Shen Yan, Shixiong Zhao, Shuai Peng, Shuangye Li, Sihang Yuan, Sijin Wu, Tianheng Cheng, Weiwei Liu, Wenqian Wang, Xianhan Zeng, Xiao Liu, Xiaobo Qin, Xiaohan Ding, Xiaojun Xiao, Xiaoying Zhang, Xuanwei Zhang, Xuehan Xiong, Yanghua Peng, Yangrui Chen, Yanwei Li, Yanxu Hu, Yi Lin, Yiyuan Hu, Yiyuan Zhang, Youbin Wu, Yu Li, Yudong Liu, Yue Ling, Yujia Qin, Zanbo Wang, Zhiwu He, Aoxue Zhang, Bairen Yi, Bencheng Liao, Can Huang, Can Zhang, Chaorui Deng, Chaoyi Deng, Cheng Lin, Cheng Yuan, Chenggang Li, Chenhui Gou, Chenwei Lou, Chengzhi Wei, Chundian Liu, Chunyuan Li, Deyao Zhu, Donghong Zhong, Feng Li, Feng Zhang, Gang Wu, Guodong Li, Guohong Xiao, Haibin Lin, Haihua Yang, Haoming Wang, Heng Ji, Hongxiang Hao, Hui Shen, Huixia Li, Jiahao Li, Jialong Wu, Jianhua Zhu, Jianpeng Jiao, Jiashi Feng, Jiaze Chen, Jianhui Duan, Jihao Liu, Jin Zeng, Jingqun Tang, Jingyu Sun, Joya Chen, Jun Long, Junda Feng, Junfeng Zhan, Junjie Fang, Junting Lu, Kai Hua, Kai Liu, Kai Shen, Kaiyuan Zhang, Ke Shen, Ke Wang, Keyu Pan, Kun Zhang, Kunchang Li, Lanxin Li, Lei LI, Lei Shi, Li Han, Liang Xiang, Liangqiang Chen, Lin Chen, Lin Li, Lin Yan, Liying Chi, Longxiang Liu, Mengfei Du, Mingxuan Wang, Ningxin Pan, Peibin Chen, Pengfei Chen, Pengfei Wu, Qingqing Yuan, Qingyao Shuai, Qiuyan Tao, Renjie Zheng, Renrui Zhang, Ru Zhang, Rui Wang, Rui Yang, Rui Zhao, Shaoqiang Xu, Shihao Liang, Shipeng Yan, Shu Zhong, Shuaishuai Cao, Shuangzhi Wu, Shufan Liu, Shuhan Chang, Songhua Cai, Tenglong Ao, Tianhao Yang, Tingting Zhang, Wanjun Zhong, Wei Jia, Wei Weng, Weihao Yu, Wenhao Huang, Wenjia Zhu, Wenli Yang, Wenzhi Wang, Xiang Long, XiangRui Yin, Xiao Li, Xiaolei Zhu, Xiaoying Jia, Xijin Zhang, Xin Liu, Xinchen Zhang, Xinyu Yang, Xiongcai Luo, Xiuli Chen, Xuantong Zhong, Xuefeng Xiao, Xujing Li, Yan Wu, Yawei Wen, Yifan Du, Yihao Zhang, Yining Ye, Yonghui Wu, Yu Liu, Yu Yue, Yufeng Zhou, Yufeng Yuan, Yuhang Xu, Yuhong Yang, Yun Zhang, Yunhao Fang, Yuntao Li, Yurui Ren, Yuwen Xiong, Zehua Hong, Zehua Wang, Zewei Sun, Zeyu Wang, Zhao Cai, Zhaoyue Zha, Zhecheng An, Zhehui Zhao, Zhengzhuo Xu, Zhipeng Chen, Zhiyong Wu, Zhuofan Zheng, ZiHao Wang, Zilong Huang, Ziyu Zhu, Zuquan Song

We present Seed1. 5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning.

Mixture-of-Experts Multimodal Reasoning +1

OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval

no code implementations10 May 2025 Wei Yang, Jingjing Fu, Rui Wang, Jinyu Wang, Lei Song, Jiang Bian

The effectiveness of Vision-language RAG systems hinges on multimodal retrieval, which is inherently challenging due to the diverse modalities and knowledge granularities in both queries and knowledge bases.

Cross-Modal Retrieval Question Answering +4

CachePrune: Neural-Based Attribution Defense Against Indirect Prompt Injection Attacks

no code implementations29 Apr 2025 Rui Wang, Junda Wu, Yu Xia, Tong Yu, Ruiyi Zhang, Ryan Rossi, Lina Yao, Julian McAuley

In this paper, we propose CachePrune that defends against this attack by identifying and pruning task-triggering neurons from the KV cache of the input prompt context.

Instruction Following

Fast-Slow Co-advancing Optimizer: Toward Harmonious Adversarial Training of GAN

no code implementations21 Apr 2025 Lin Wang, Xiancheng Wang, Rui Wang, Zhibo Zhang, Minghang Zhao

Up to now, the training processes of typical Generative Adversarial Networks (GANs) are still particularly sensitive to data properties and hyperparameters, which may lead to severe oscillations, difficulties in convergence, or even failures to converge, especially when the overall variances of the training sets are large.

SST: Self-training with self-adaptive thresholding for semi-supervised learning

no code implementations Information Processing & Management 2025 Shuai Zhao, Heyan Huang, Xinge Li, Xiaokang Chen, Rui Wang

Semi-supervised learning (SSL) offers a solution to this problem by utilizing a small amount of labeled data along with a large volume of unlabeled data.

 Ranked #1 on Image Classification on Clothing1M (using extra training data)

Semi-Supervised Image Classification

Low-Overhead Channel Estimation Framework for Beyond Diagonal Reconfigurable Intelligent Surface Assisted Multi-User MIMO Communication

no code implementations15 Apr 2025 Rui Wang, Shuowen Zhang, Bruno Clerckx, Liang Liu

However, the number of coefficients in the cascaded channels to be estimated in BD-RIS assisted systems is significantly larger than that in conventional RIS assisted systems, because the channels associated with the off-diagonal elements of the scattering matrix have to be estimated as well.

OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding

no code implementations15 Apr 2025 Dianbing Xi, Jiepeng Wang, Yuanzhi Liang, Xi Qiu, Yuchi Huo, Rui Wang, Chi Zhang, Xuelong Li

In this paper, we propose a novel framework for controllable video diffusion, OmniVDiff, aiming to synthesize and comprehend multiple video visual content in a single diffusion model.

Semantic Segmentation Video Generation +1

SkyReels-A2: Compose Anything in Video Diffusion Transformers

1 code implementation3 Apr 2025 Zhengcong Fei, Debang Li, Di Qiu, Jiahua Wang, Yikun Dou, Rui Wang, Jingtao Xu, Mingyuan Fan, Guibin Chen, Yang Li, Yahui Zhou

This paper presents SkyReels-A2, a controllable video generation framework capable of assembling arbitrary visual elements (e. g., characters, objects, backgrounds) into synthesized videos based on textual prompts while maintaining strict consistency with reference images for each element.

Video Generation

Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1

1 code implementation31 Mar 2025 Yi Chen, Yuying Ge, Rui Wang, Yixiao Ge, Lu Qiu, Ying Shan, Xihui Liu

Multimodal Large Language Models (MLLMs) inherit this reasoning potential but remain underexplored in tasks requiring both perception and logical reasoning.

Logical Reasoning Multiple-choice +2

Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models

1 code implementation31 Mar 2025 Rui Wang, Hongru Wang, Boyang Xue, Jianhui Pang, Shudong Liu, Yi Chen, Jiahao Qiu, Derek Fai Wong, Heng Ji, Kam-Fai Wong

Recent advancements in Large Language Models (LLMs) have significantly enhanced their ability to perform complex reasoning tasks, transitioning from fast and intuitive thinking (System 1) to slow and deep reasoning (System 2).

PlatMetaX: An Integrated MATLAB platform for Meta-Black-Box Optimization

1 code implementation26 Mar 2025 Xu Yang, Rui Wang, Kaiwen Li, Wenhua Li, Tao Zhang, Fujun He

The landscape of optimization problems has become increasingly complex, necessitating the development of advanced optimization techniques.

Meta-Learning

RoCA: Robust Contrastive One-class Time Series Anomaly Detection with Contaminated Data

1 code implementation24 Mar 2025 Xudong Mou, Rui Wang, Bo Li, Tianyu Wo, Jie Sun, Hui Wang, Xudong Liu

It fuses the separated assumptions of one-class classification and contrastive learning in a single training process to characterize a more complete so-called normality.

Contrastive Learning One-Class Classification +2

Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique

no code implementations21 Mar 2025 Yansi Li, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Qiuzhi Liu, Rui Wang, Zhuosheng Zhang, Zhaopeng Tu, Haitao Mi, Dong Yu

Traditional inference time scaling methods utilize scalar reward signals from process reward models to evaluate candidate reasoning steps, but these scalar rewards lack the nuanced qualitative information essential for understanding and justifying each step.

Decision Making

MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance

no code implementations20 Mar 2025 Quanhao Li, Zhen Xing, Rui Wang, HUI ZHANG, Qi Dai, Zuxuan Wu

However, existing methods struggle with complex object movements and multi-object motion control, resulting in imprecise trajectory adherence, poor object consistency, and compromised visual quality.

Image to Video Generation Object

Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data

no code implementations20 Mar 2025 Zijian Li, Jingjing Fu, Lei Song, Jiang Bian, Jun Zhang, Rui Wang

Employing \textit{CoF}, we construct the \textit{ChartCoF} dataset, with 1. 4k complex reasoning Q\&A for fine-grained analysis and 50k Q\&A for reasoning enhancement.

Diversity Visual Reasoning

DAST: Difficulty-Aware Self-Training on Large Language Models

no code implementations12 Mar 2025 Boyang Xue, Qi Zhu, Hongru Wang, Rui Wang, Sheng Wang, Hongling Xu, Fei Mi, Yasheng Wang, Lifeng Shang, Qun Liu, Kam-Fai Wong

Present Large Language Models (LLM) self-training methods always under-sample on challenging queries, leading to inadequate learning on difficult problems which limits LLMs' ability.

Data Augmentation

Self-Consistent Equation-guided Neural Networks for Censored Time-to-Event Data

1 code implementation12 Mar 2025 Sehwan Kim, Rui Wang, Wenbin Lu

In survival analysis, estimating the conditional survival function given predictors is often of interest.

Deep Learning Survival Analysis

Multimodal Human-AI Synergy for Medical Imaging Quality Control: A Hybrid Intelligence Framework with Adaptive Dataset Curation and Closed-Loop Evaluation

no code implementations10 Mar 2025 Zhi Qin, Qianhui Gui, Mouxiao Bian, Rui Wang, Hong Ge, Dandan Yao, Ziying Sun, Yuan Zhao, Yu Zhang, Hui Shi, Dongdong Wang, Chenxin Song, Shenghong Ju, Lihao Liu, Junjun He, Jie Xu, Yuan-Cheng Wang

To address this challenge, in this study, we establish a standardized dataset and evaluation framework for medical imaging QC, systematically assessing large language models (LLMs) in image quality assessment and report standardization.

Image Quality Assessment

Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model

1 code implementation8 Mar 2025 Mingxing Li, Rui Wang, Lei Sun, Yancheng Bai, Xiangxiang Chu

The rapid expansion of mobile internet has resulted in a substantial increase in user-generated content (UGC) images, thereby making the thorough assessment of UGC images both urgent and essential.

Image Quality Assessment Language Modeling +6

SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers

1 code implementation15 Feb 2025 Di Qiu, Zhengcong Fei, Rui Wang, Jialin Bai, Changqian Yu, Mingyuan Fan, Guibin Chen, Xiang Wen

We present SkyReels-A1, a simple yet effective framework built upon video diffusion Transformer to facilitate portrait image animation.

Image Animation Portrait Animation

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

3 code implementations14 Feb 2025 Guoqing Ma, Haoyang Huang, Kun Yan, Liangyu Chen, Nan Duan, Shengming Yin, Changyi Wan, Ranchen Ming, Xiaoniu Song, Xing Chen, Yu Zhou, Deshan Sun, Deyu Zhou, Jian Zhou, Kaijun Tan, Kang An, Mei Chen, Wei Ji, Qiling Wu, Wen Sun, Xin Han, Yanan Wei, Zheng Ge, Aojie Li, Bin Wang, Bizhu Huang, Bo wang, Brian Li, Changxing Miao, Chen Xu, Chenfei Wu, Chenguang Yu, Dapeng Shi, Dingyuan Hu, Enle Liu, Gang Yu, Ge Yang, Guanzhe Huang, Gulin Yan, Haiyang Feng, Hao Nie, Haonan Jia, Hanpeng Hu, Hanqi Chen, Haolong Yan, Heng Wang, Hongcheng Guo, Huilin Xiong, Huixin Xiong, Jiahao Gong, Jianchang Wu, Jiaoren Wu, Jie Wu, Jie Yang, Jiashuai Liu, Jiashuo Li, Jingyang Zhang, Junjing Guo, Junzhe Lin, Kaixiang Li, Lei Liu, Lei Xia, Liang Zhao, Liguo Tan, Liwen Huang, Liying Shi, Ming Li, Mingliang Li, Muhua Cheng, Na Wang, Qiaohui Chen, Qinglin He, Qiuyan Liang, Quan Sun, Ran Sun, Rui Wang, Shaoliang Pang, Shiliang Yang, Sitong Liu, SiQi Liu, Shuli Gao, Tiancheng Cao, Tianyu Wang, Weipeng Ming, Wenqing He, Xu Zhao, Xuelin Zhang, Xianfang Zeng, Xiaojia Liu, Xuan Yang, Yaqi Dai, Yanbo Yu, Yang Li, Yineng Deng, Yingming Wang, Yilei Wang, Yuanwei Lu, Yu Chen, Yu Luo, Yuchu Luo, Yuhe Yin, Yuheng Feng, Yuxiang Yang, Zecheng Tang, Zekai Zhang, Zidong Yang, Binxing Jiao, Jiansheng Chen, Jing Li, Shuchang Zhou, Xiangyu Zhang, Xinhao Zhang, Yibo Zhu, Heung-Yeung Shum, Daxin Jiang

We present Step-Video-T2V, a state-of-the-art text-to-video pre-trained model with 30B parameters and the ability to generate videos up to 204 frames in length.

Video Generation Video Reconstruction

Graph Federated Learning Based Proactive Content Caching in Edge Computing

no code implementations7 Feb 2025 Rui Wang

With the rapid growth of mobile data traffic and the increasing prevalence of video streaming, proactive content caching in edge computing has become crucial for reducing latency and alleviating network congestion.

Edge-computing Federated Learning

From Uncertain to Safe: Conformal Fine-Tuning of Diffusion Models for Safe PDE Control

no code implementations4 Feb 2025 Peiyan Hu, Xiaowei Qian, Wenhao Deng, Rui Wang, Haodong Feng, Ruiqi Feng, Tao Zhang, Long Wei, Yue Wang, Zhi-Ming Ma, Tailin Wu

To address this limitation, we propose Safe Diffusion Models for PDE Control (SafeDiffCon), which introduce the uncertainty quantile as model uncertainty quantification to achieve optimal control under safety constraints through both post-training and inference phases.

Conformal Prediction Uncertainty Quantification

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

no code implementations30 Jan 2025 Yue Wang, Qiuzhi Liu, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Linfeng Song, Dian Yu, Juntao Li, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu

To address underthinking, we propose a decoding strategy with thought switching penalty TIP that discourages premature transitions between thoughts, encouraging deeper exploration of each reasoning path.

All

Towards Robust Incremental Learning under Ambiguous Supervision

no code implementations23 Jan 2025 Rui Wang, Mingxuan Xia, Chang Yao, Lei Feng, Junbo Zhao, Gang Chen, Haobo Wang

Technically, we develop the Prototype-Guided Disambiguation and Replay Algorithm (PGDR) which leverages the class prototypes as a proxy to mitigate two intertwined challenges in IPLL, i. e., label ambiguity and catastrophic forgetting.

Incremental Learning Partial Label Learning +1

Do Large Language Models Truly Understand Geometric Structures?

1 code implementation23 Jan 2025 XiaoFeng Wang, Yiming Wang, Wenhong Zhu, Rui Wang

Geometric ability is a significant challenge for large language models (LLMs) due to the need for advanced spatial comprehension and abstract thinking.

Reinforcement learning Based Automated Design of Differential Evolution Algorithm for Black-box Optimization

no code implementations22 Jan 2025 Xu Yang, Rui Wang, Kaiwen Li, Ling Wang

To address this challenge, we introduce a novel framework that employs reinforcement learning (RL) to automatically design DE for black-box optimization through meta-learning.

Evolutionary Algorithms Meta-Learning +1

PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation

1 code implementation20 Jan 2025 Jinyu Wang, Jingjing Fu, Rui Wang, Lei Song, Jiang Bian

Despite notable advancements in Retrieval-Augmented Generation (RAG) systems that expand large language model (LLM) capabilities through external retrieval, these systems often struggle to meet the complex and diverse needs of real-world industrial applications.

Language Modeling Language Modelling +3

A Dynamic Improvement Framework for Vehicular Task Offloading

no code implementations20 Jan 2025 Qianren Li, Yuncong Hong, Bojie Lv, Rui Wang

In this paper, the task offloading from vehicles with random velocities is optimized via a novel dynamic improvement framework.

Scheduling

FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models

no code implementations18 Jan 2025 Xinglin Pan, WenXiang Lin, Lin Zhang, Shaohuai Shi, Zhenheng Tang, Rui Wang, Bo Li, Xiaowen Chu

To enable versatile usage of MoE models, we introduce FSMoE, a flexible training system optimizing task scheduling with three novel techniques: 1) Unified abstraction and online profiling of MoE modules for task scheduling across various MoE implementations.

Mixture-of-Experts Scheduling

Neural Honeytrace: A Robust Plug-and-Play Watermarking Framework against Model Extraction Attacks

1 code implementation16 Jan 2025 Yixiao Xu, Binxing Fang, Rui Wang, Yinghai Zhou, Shouling Ji, YuAn Liu, Mohan Li, Zhihong Tian

Guided by the model, we further introduce: (1) a similarity-based training-free watermarking method for plug-and-play and flexible watermarking, and (2) a distribution-based multi-step watermark information transmission strategy for robust watermarking.

Model extraction

How Large is the Universe of RNA-Like Motifs? A Clustering Analysis of RNA Graph Motifs Using Topological Descriptors

1 code implementation8 Jan 2025 Rui Wang, Tamar Schlick

To determine which dual graphs in the 99. 89% hypothetical set are more likely to be associated with RNA structures, we apply computational topology descriptors using the Persistent Spectral Graphs (PSG) method to characterize each graph using 19 PSG-based features and use clustering algorithms that partition all possible dual graphs into two clusters, RNA-like cluster and non-RNA-like cluster.

RAG

M2I2: Learning Efficient Multi-Agent Communication via Masked State Modeling and Intention Inference

no code implementations31 Dec 2024 Chuxiong Sun, Peng He, Qirui Ji, Zehua Zang, Jiangmeng Li, Rui Wang, Wei Wang

To address this issue, we introduce M2I2, a novel framework designed to enhance the agents' capabilities to assimilate and utilize received information effectively.

Meta-Learning

An Experimental Study of Passive UAV Tracking with Digital Arrays and Cellular Downlink Signals

no code implementations30 Dec 2024 Yifei Sun, Chao Yu, Yan Luo, Tony Xiao Han, Haisheng Tan, Rui Wang, Francis C. M. Lau

Given the prospects of the low-altitude economy (LAE) and the popularity of unmanned aerial vehicles (UAVs), there are increasing demands on monitoring flying objects at low altitude in wide urban areas.

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

no code implementations30 Dec 2024 Xingyu Chen, Jiahao Xu, Tian Liang, Zhiwei He, Jianhui Pang, Dian Yu, Linfeng Song, Qiuzhi Liu, Mengfei Zhou, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu

The remarkable performance of models like the OpenAI o1 can be attributed to their ability to emulate human-like long-time thinking during inference.

GSM8K

RobotDiffuse: Motion Planning for Redundant Manipulator based on Diffusion Model

1 code implementation27 Dec 2024 Xiaohan Zhang, Xudong Mou, Rui Wang, Tianyu Wo, Ningbo Gu, Tiejun Wang, Cangbai Xu, Xudong Liu

This paper introduces RobotDiffuse, a diffusion model-based approach for motion planning in redundant manipulators.

Acrobot Motion Planning

Multi-UAV Collaborative Trajectory Planning for Seamless Data Collection and Transmission

no code implementations17 Dec 2024 Rui Wang, Kaitao Meng, Deshi Li

Specifically, the mission completion time is minimized by optimizing the trajectories, task allocation, collection time scheduling, and transmission topology of UAVs while ensuring backhaul link to the BS.

Scheduling Trajectory Planning

UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models

1 code implementation16 Dec 2024 Boyang Xue, Fei Mi, Qi Zhu, Hongru Wang, Rui Wang, Sheng Wang, Erxin Yu, Xuming Hu, Kam-Fai Wong

Despite demonstrating impressive capabilities, Large Language Models (LLMs) still often struggle to accurately express the factual knowledge they possess, especially in cases where the LLMs' knowledge boundaries are ambiguous.

Question Answering

StarWhisper Telescope: Agent-Based Observation Assistant System to Approach AI Astrophysicist

2 code implementations9 Dec 2024 Cunshi Wang, Xinjie Hu, Yu Zhang, Xunhao Chen, Pengliang Du, Yiming Mao, Rui Wang, Yuyang Li, Ying Wu, Hang Yang, Yansong Li, Beichuan Wang, Haiyang Mu, Zheng Wang, Jianfeng Tian, Liang Ge, Yongna Mao, Shengming Li, Xiaomeng Lu, Jinhang Zou, Yang Huang, Ningchen Sun, Jie Zheng, Min He, Yu Bai, Junjie Jin, Hong Wu, Jifeng Liu

Additionally, the integration of AI agents within the system provides online accessibility, saving astronomers' time and encouraging greater participation from amateur astronomers in the NGSS project.

Wavelet Diffusion Neural Operator

1 code implementation6 Dec 2024 Peiyan Hu, Rui Wang, Xiang Zheng, Tao Zhang, Haodong Feng, Ruiqi Feng, Long Wei, Yue Wang, Zhi-Ming Ma, Tailin Wu

Recently, diffusion generative models have emerged as a competitive class of methods for these tasks due to their ability to capture long-term dependencies and model high-dimensional states.

LADDER: Multi-objective Backdoor Attack via Evolutionary Algorithm

no code implementations28 Nov 2024 Dazhuang Liu, Yanqi Qiao, Rui Wang, Kaitai Liang, Georgios Smaragdakis

Current black-box backdoor attacks in convolutional neural networks formulate attack objective(s) as single-objective optimization problems in single domain.

Backdoor Attack

Draft Model Knows When to Stop: A Self-Verification Length Policy for Speculative Decoding

1 code implementation27 Nov 2024 Ziyin Zhang, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Rui Wang, Zhaopeng Tu

Speculative Decoding (SD) has become an important technique in accelerating the inference speed of large language models.

8k

SPDFusion: An Infrared and Visible Image Fusion Network Based on a Non-Euclidean Representation of Riemannian Manifolds

no code implementations16 Nov 2024 Huan Kang, Hui Li, Tianyang Xu, Rui Wang, Xiao-Jun Wu, Josef Kittler

Euclidean representation learning methods have achieved commendable results in image fusion tasks, which can be attributed to their clear advantages in handling with linear space.

Infrared And Visible Image Fusion Representation Learning

OpenLS-DGF: An Adaptive Open-Source Dataset Generation Framework for Machine Learning Tasks in Logic Synthesis

2 code implementations14 Nov 2024 Liwei Ni, Rui Wang, Miao Liu, Xingyu Meng, Xiaoze Lin, Junfeng Liu, Guojie Luo, Zhufei Chu, Weikang Qian, Xiaoyan Yang, Biwei Xie, Xingquan Li, Huawei Li

This paper introduces OpenLS-DGF, an adaptive logic synthesis dataset generation framework, to enhance machine learning~(ML) applications within the logic synthesis process.

Dataset Generation

Metric Learning for Tag Recommendation: Tackling Data Sparsity and Cold Start Issues

no code implementations10 Nov 2024 Yuanshuai Luo, Rui Wang, Yaxin Liang, Ankai Liang, Wenyi Liu

With the rapid growth of digital information, personalized recommendation systems have become an indispensable part of Internet services, especially in the fields of e-commerce, social media, and online entertainment.

Collaborative Filtering Metric Learning +2

Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning

no code implementations5 Nov 2024 Bei Li, Tong Zheng, Rui Wang, Jiahao Liu, Qingyan Guo, Junliang Guo, Xu Tan, Tong Xiao, Jingbo Zhu, Jingang Wang, Xunliang Cai

First, we introduce a predictor-corrector learning framework to minimize truncation errors, which consists of a high-order predictor and a multistep corrector.

Abstractive Text Summarization Language Modeling +4

EnsIR: An Ensemble Algorithm for Image Restoration via Gaussian Mixture Models

1 code implementation30 Oct 2024 Shangquan Sun, Wenqi Ren, Zikun Liu, Hyunhee Park, Rui Wang, Xiaochun Cao

We estimate the range-wise ensemble weights on a reference set and store them in a lookup table (LUT) for efficient ensemble inference on the test set.

Deblurring Ensemble Learning +3

MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis

no code implementations28 Oct 2024 Di Qiu, Zheng Chen, Rui Wang, Mingyuan Fan, Changqian Yu, Junshi Huang, Xiang Wen

Recent advancements in character video synthesis still depend on extensive fine-tuning or complex 3D modeling processes, which can restrict accessibility and hinder real-time applicability.

Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model

no code implementations24 Oct 2024 Wenhong Zhu, Zhiwei He, XiaoFeng Wang, PengFei Liu, Rui Wang

Aligning language models (LMs) with human preferences has become a key area of research, enabling these models to meet diverse user needs better.

A Recommendation Model Utilizing Separation Embedding and Self-Attention for Feature Mining

no code implementations19 Oct 2024 Wenyi Liu, Rui Wang, Yuanshuai Luo, Jianjun Wei, Zihao Zhao, Junming Huang

With the explosive growth of Internet data, users are facing the problem of information overload, which makes it a challenge to efficiently obtain the required resources.

Click-Through Rate Prediction feature selection +2

CAPE: A Chinese Dataset for Appraisal-based Emotional Generation using Large Language Models

no code implementations18 Oct 2024 June M. Liu, He Cao, Renliang Sun, Rui Wang, Yu Li, Jiaxing Zhang

Generating emotionally appropriate responses in conversations with large language models presents a significant challenge due to the complexities of human emotions and cognitive processes, which remain largely underexplored in their critical role in social interactions.

Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation

1 code implementation17 Oct 2024 Yiming Wang, Pei Zhang, Baosong Yang, Derek F. Wong, Rui Wang

LLM self-evaluation relies on the LLM's own ability to estimate response correctness, which can greatly improve its deployment reliability.

MlingConf: A Comprehensive Study of Multilingual Confidence Estimation on Large Language Models

1 code implementation16 Oct 2024 Boyang Xue, Hongru Wang, Rui Wang, Sheng Wang, Zezhong Wang, Yiming Du, Bin Liang, Kam-Fai Wong

This paper addresses this gap by introducing a comprehensive investigation of Multilingual Confidence estimation (MlingConf) on LLMs, focusing on both language-agnostic (LA) and language-specific (LS) tasks to explore the performance and language dominance effects of multilingual confidence estimations on different tasks.

Balancing Innovation and Privacy: Data Security Strategies in Natural Language Processing Applications

no code implementations11 Oct 2024 Shaobo Liu, Guiran Liu, Binrong Zhu, Yuanshuai Luo, Linxiao Wu, Rui Wang

By introducing a differential privacy mechanism, our model ensures the accuracy and reliability of data analysis results while adding random noise.

Computational Efficiency Machine Translation +1

AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction

1 code implementation10 Oct 2024 Hongru Wang, Rui Wang, Boyang Xue, Heming Xia, Jingtao Cao, Zeming Liu, Jeff Z. Pan, Kam-Fai Wong

In this paper, we introduce \texttt{AppBench}, the first benchmark to evaluate LLMs' ability to plan and execute multiple APIs from various sources in order to complete the user's task.

In-Context Learning

TLDR: Token-Level Detective Reward Model for Large Vision Language Models

no code implementations7 Oct 2024 Deqing Fu, Tong Xiao, Rui Wang, Wang Zhu, Pengchuan Zhang, Guan Pang, Robin Jia, Lawrence Chen

Although reward models have been successful in improving multimodal large language models, the reward models themselves remain brutal and contain minimal information.

Hallucination Hallucination Evaluation

Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning

no code implementations3 Oct 2024 Tianxiang Hu, Pei Zhang, Baosong Yang, Jun Xie, Derek F. Wong, Rui Wang

Achieving consistent high-quality machine translation (MT) across diverse domains remains a significant challenge, primarily due to the limited and imbalanced parallel training data available in various domains.

Benchmarking Language Modeling +4

Segment as You Wish -- Free-Form Language-Based Segmentation for Medical Images

no code implementations2 Oct 2024 Longchao Da, Rui Wang, Xiaojian Xu, Parminder Bhatia, Taha Kass-Hout, Hua Wei, Cao Xiao

Medical imaging is crucial for diagnosing a patient's health condition, and accurate segmentation of these images is essential for isolating regions of interest to ensure precise diagnosis and treatment planning.

Anatomy Form +6

AlignSum: Data Pyramid Hierarchical Fine-tuning for Aligning with Human Summarization Preference

1 code implementation1 Oct 2024 Yang Han, Yiming Wang, Rui Wang, Lu Chen, Kai Yu

This demonstrates that AlignSum significantly enhances the alignment of language models with human summarization preferences.

Text Summarization

Trajectory Tracking for MmWave Communication Systems via Cooperative Passive Sensing

no code implementations30 Sep 2024 Chao Yu, Bojie Lv, Haoyu Qiu, Rui Wang

Specifically, in the uplink communication with at least two transmitters, a cooperative detection method is proposed for the receiver to track the blocker's trajectory, localize the transmitters and detect the potential link blockage jointly.

RMLR: Extending Multinomial Logistic Regression into General Geometries

2 code implementations28 Sep 2024 Ziheng Chen, Yue Song, Rui Wang, XiaoJun Wu, Nicu Sebe

Specifically, we showcase our framework on the Symmetric Positive Definite (SPD) manifold and special orthogonal group, i. e., the set of rotation matrices.

regression

Harnessing Diversity for Important Data Selection in Pretraining Large Language Models

no code implementations25 Sep 2024 Chi Zhang, Huaping Zhong, Kuan Zhang, Chengliang Chai, Rui Wang, Xinlin Zhuang, Tianyi Bai, Jiantao Qiu, Lei Cao, Ju Fan, Ye Yuan, Guoren Wang, Conghui He

For each cluster, if we opt to select data from it, we take some samples to evaluate the influence to prevent processing all instances.

Diversity

MADial-Bench: Towards Real-world Evaluation of Memory-Augmented Dialogue Generation

no code implementations23 Sep 2024 Junqing He, Liang Zhu, Rui Wang, Xi Wang, Reza Haffari, Jiaxing Zhang

Long-term memory is important for chatbots and dialogue systems (DS) to create consistent and human-like conversations, evidenced by numerous developed memory-augmented DS (MADS).

Dialogue Generation Retrieval

Graph Neural Network Framework for Sentiment Analysis Using Syntactic Feature

no code implementations21 Sep 2024 Linxiao Wu, Yuanshuai Luo, Binrong Zhu, Guiran Liu, Rui Wang, Qian Yu

Amidst the swift evolution of social media platforms and e-commerce ecosystems, the domain of opinion mining has surged as a pivotal area of exploration within natural language processing.

Graph Neural Network Opinion Mining +1

Context-Conditioned Spatio-Temporal Predictive Learning for Reliable V2V Channel Prediction

no code implementations16 Sep 2024 Lei Chu, Daoud Burghal, Rui Wang, Michael Neuman, Andreas F. Molisch

Achieving reliable multidimensional Vehicle-to-Vehicle (V2V) channel state information (CSI) prediction is both challenging and crucial for optimizing downstream tasks that depend on instantaneous CSI.

Meta-Learning

Untie the Knots: An Efficient Data Augmentation Strategy for Long-Context Pre-Training in Language Models

no code implementations7 Sep 2024 Junfeng Tian, Da Zheng, Yang Cheng, Rui Wang, Colin Zhang, Debing Zhang

Large language models (LLM) have prioritized expanding the context window from which models can incorporate more information.

Data Augmentation

GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding

no code implementations6 Sep 2024 Ziyin Zhang, Hang Yu, Shijie Li, Peng Di, Jianguo Li, Rui Wang

Programming languages possess rich semantic information such as data flow that is represented by graphs and not available from the surface form of source code.

cross-modal alignment Language Modelling +1

Fast Capacity Estimation in Ultra-dense Wireless Networks with Random Interference

no code implementations31 Aug 2024 Dandan Jiang, Rui Wang, Jiang Xue

In ultra-dense wireless networks with extremely large numbers of base stations (BSs) and users, we provide fast estimation methods for determining the capacity.

Capacity Estimation

A Hybrid Transformer-Mamba Network for Single Image Deraining

1 code implementation31 Aug 2024 Shangquan Sun, Wenqi Ren, Juxiang Zhou, Jianhou Gan, Rui Wang, Xiaochun Cao

Existing deraining Transformers employ self-attention mechanisms with fixed-range windows or along channel dimensions, limiting the exploitation of non-local receptive fields.

Mamba Single Image Deraining

MakeupAttack: Feature Space Black-box Backdoor Attack on Face Recognition via Makeup Transfer

1 code implementation22 Aug 2024 Ming Sun, Lihua Jing, Zixuan Zhu, Rui Wang

Furthermore, due to the perceptibility, diversity, and similarity of facial datasets, many state-of-the-art backdoor attacks lose effectiveness on face recognition tasks.

Backdoor Attack Diversity +1

Fostering Natural Conversation in Large Language Models with NICO: a Natural Interactive COnversation dataset

no code implementations18 Aug 2024 Renliang Sun, Mengyuan Liu, Shiping Yang, Rui Wang, Junqing He, Jiaxing Zhang

Benefiting from diverse instruction datasets, contemporary Large Language Models (LLMs) perform effectively as AI assistants in collaborating with humans.

Sentence

PITN: Physics-Informed Temporal Networks for Cuffless Blood Pressure Estimation

1 code implementation16 Aug 2024 Rui Wang, Mengshi Qi, Yingxia Shao, Anfu Zhou, Huadong Ma

To tackle this challenge, we introduce a novel physics-informed temporal network~(PITN) with adversarial contrastive learning to enable precise BP estimation with very limited data.

Blood pressure estimation Contrastive Learning +1

Sparse Deep Learning Models with the $\ell_1$ Regularization

no code implementations5 Aug 2024 Lixin Shen, Rui Wang, Yuesheng Xu, Mingsong Yan

Sparse neural networks are highly desirable in deep learning in reducing its complexity.

Deep Learning

The Llama 3 Herd of Models

2 code implementations31 Jul 2024 Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, Bobbie Chern, Charlotte Caucheteux, Chaya Nayak, Chloe Bi, Chris Marra, Chris McConnell, Christian Keller, Christophe Touret, Chunyang Wu, Corinne Wong, Cristian Canton Ferrer, Cyrus Nikolaidis, Damien Allonsius, Daniel Song, Danielle Pintz, Danny Livshits, Danny Wyatt, David Esiobu, Dhruv Choudhary, Dhruv Mahajan, Diego Garcia-Olano, Diego Perino, Dieuwke Hupkes, Egor Lakomkin, Ehab AlBadawy, Elina Lobanova, Emily Dinan, Eric Michael Smith, Filip Radenovic, Francisco Guzmán, Frank Zhang, Gabriel Synnaeve, Gabrielle Lee, Georgia Lewis Anderson, Govind Thattai, Graeme Nail, Gregoire Mialon, Guan Pang, Guillem Cucurell, Hailey Nguyen, Hannah Korevaar, Hu Xu, Hugo Touvron, Iliyan Zarov, Imanol Arrieta Ibarra, Isabel Kloumann, Ishan Misra, Ivan Evtimov, Jack Zhang, Jade Copet, Jaewon Lee, Jan Geffert, Jana Vranes, Jason Park, Jay Mahadeokar, Jeet Shah, Jelmer Van der Linde, Jennifer Billock, Jenny Hong, Jenya Lee, Jeremy Fu, Jianfeng Chi, Jianyu Huang, Jiawen Liu, Jie Wang, Jiecao Yu, Joanna Bitton, Joe Spisak, Jongsoo Park, Joseph Rocca, Joshua Johnstun, Joshua Saxe, Junteng Jia, Kalyan Vasuden Alwala, Karthik Prasad, Kartikeya Upasani, Kate Plawiak, Ke Li, Kenneth Heafield, Kevin Stone, Khalid El-Arini, Krithika Iyer, Kshitiz Malik, Kuenley Chiu, Kunal Bhalla, Kushal Lakhotia, Lauren Rantala-Yeary, Laurens van der Maaten, Lawrence Chen, Liang Tan, Liz Jenkins, Louis Martin, Lovish Madaan, Lubo Malo, Lukas Blecher, Lukas Landzaat, Luke de Oliveira, Madeline Muzzi, Mahesh Pasupuleti, Mannat Singh, Manohar Paluri, Marcin Kardas, Maria Tsimpoukelli, Mathew Oldham, Mathieu Rita, Maya Pavlova, Melanie Kambadur, Mike Lewis, Min Si, Mitesh Kumar Singh, Mona Hassan, Naman Goyal, Narjes Torabi, Nikolay Bashlykov, Nikolay Bogoychev, Niladri Chatterji, Ning Zhang, Olivier Duchenne, Onur Çelebi, Patrick Alrassy, Pengchuan Zhang, Pengwei Li, Petar Vasic, Peter Weng, Prajjwal Bhargava, Pratik Dubal, Praveen Krishnan, Punit Singh Koura, Puxin Xu, Qing He, Qingxiao Dong, Ragavan Srinivasan, Raj Ganapathy, Ramon Calderer, Ricardo Silveira Cabral, Robert Stojnic, Roberta Raileanu, Rohan Maheswari, Rohit Girdhar, Rohit Patel, Romain Sauvestre, Ronnie Polidoro, Roshan Sumbaly, Ross Taylor, Ruan Silva, Rui Hou, Rui Wang, Saghar Hosseini, Sahana Chennabasappa, Sanjay Singh, Sean Bell, Seohyun Sonia Kim, Sergey Edunov, Shaoliang Nie, Sharan Narang, Sharath Raparthy, Sheng Shen, Shengye Wan, Shruti Bhosale, Shun Zhang, Simon Vandenhende, Soumya Batra, Spencer Whitman, Sten Sootla, Stephane Collot, Suchin Gururangan, Sydney Borodinsky, Tamar Herman, Tara Fowler, Tarek Sheasha, Thomas Georgiou, Thomas Scialom, Tobias Speckbacher, Todor Mihaylov, Tong Xiao, Ujjwal Karn, Vedanuj Goswami, Vibhor Gupta, Vignesh Ramanathan, Viktor Kerkez, Vincent Gonguet, Virginie Do, Vish Vogeti, Vítor Albiero, Vladan Petrovic, Weiwei Chu, Wenhan Xiong, Wenyin Fu, Whitney Meers, Xavier Martinet, Xiaodong Wang, Xiaofang Wang, Xiaoqing Ellen Tan, Xide Xia, Xinfeng Xie, Xuchao Jia, Xuewei Wang, Yaelle Goldschlag, Yashesh Gaur, Yasmine Babaei, Yi Wen, Yiwen Song, Yuchen Zhang, Yue Li, Yuning Mao, Zacharie Delpierre Coudert, Zheng Yan, Zhengxing Chen, Zoe Papakipos, Aaditya Singh, Aayushi Srivastava, Abha Jain, Adam Kelsey, Adam Shajnfeld, Adithya Gangidi, Adolfo Victoria, Ahuva Goldstand, Ajay Menon, Ajay Sharma, Alex Boesenberg, Alexei Baevski, Allie Feinstein, Amanda Kallet, Amit Sangani, Amos Teo, Anam Yunus, Andrei Lupu, Andres Alvarado, Andrew Caples, Andrew Gu, Andrew Ho, Andrew Poulton, Andrew Ryan, Ankit Ramchandani, Annie Dong, Annie Franco, Anuj Goyal, Aparajita Saraf, Arkabandhu Chowdhury, Ashley Gabriel, Ashwin Bharambe, Assaf Eisenman, Azadeh Yazdan, Beau James, Ben Maurer, Benjamin Leonhardi, Bernie Huang, Beth Loyd, Beto De Paola, Bhargavi Paranjape, Bing Liu, Bo Wu, Boyu Ni, Braden Hancock, Bram Wasti, Brandon Spence, Brani Stojkovic, Brian Gamido, Britt Montalvo, Carl Parker, Carly Burton, Catalina Mejia, Ce Liu, Changhan Wang, Changkyu Kim, Chao Zhou, Chester Hu, Ching-Hsiang Chu, Chris Cai, Chris Tindal, Christoph Feichtenhofer, Cynthia Gao, Damon Civin, Dana Beaty, Daniel Kreymer, Daniel Li, David Adkins, David Xu, Davide Testuggine, Delia David, Devi Parikh, Diana Liskovich, Didem Foss, Dingkang Wang, Duc Le, Dustin Holland, Edward Dowling, Eissa Jamil, Elaine Montgomery, Eleonora Presani, Emily Hahn, Emily Wood, Eric-Tuan Le, Erik Brinkman, Esteban Arcaute, Evan Dunbar, Evan Smothers, Fei Sun, Felix Kreuk, Feng Tian, Filippos Kokkinos, Firat Ozgenel, Francesco Caggioni, Frank Kanayet, Frank Seide, Gabriela Medina Florez, Gabriella Schwarz, Gada Badeer, Georgia Swee, Gil Halpern, Grant Herman, Grigory Sizov, Guangyi, Zhang, Guna Lakshminarayanan, Hakan Inan, Hamid Shojanazeri, Han Zou, Hannah Wang, Hanwen Zha, Haroun Habeeb, Harrison Rudolph, Helen Suk, Henry Aspegren, Hunter Goldman, Hongyuan Zhan, Ibrahim Damlaj, Igor Molybog, Igor Tufanov, Ilias Leontiadis, Irina-Elena Veliche, Itai Gat, Jake Weissman, James Geboski, James Kohli, Janice Lam, Japhet Asher, Jean-Baptiste Gaya, Jeff Marcus, Jeff Tang, Jennifer Chan, Jenny Zhen, Jeremy Reizenstein, Jeremy Teboul, Jessica Zhong, Jian Jin, Jingyi Yang, Joe Cummings, Jon Carvill, Jon Shepard, Jonathan McPhie, Jonathan Torres, Josh Ginsburg, Junjie Wang, Kai Wu, Kam Hou U, Karan Saxena, Kartikay Khandelwal, Katayoun Zand, Kathy Matosich, Kaushik Veeraraghavan, Kelly Michelena, Keqian Li, Kiran Jagadeesh, Kun Huang, Kunal Chawla, Kyle Huang, Lailin Chen, Lakshya Garg, Lavender A, Leandro Silva, Lee Bell, Lei Zhang, Liangpeng Guo, Licheng Yu, Liron Moshkovich, Luca Wehrstedt, Madian Khabsa, Manav Avalani, Manish Bhatt, Martynas Mankus, Matan Hasson, Matthew Lennie, Matthias Reso, Maxim Groshev, Maxim Naumov, Maya Lathi, Meghan Keneally, Miao Liu, Michael L. Seltzer, Michal Valko, Michelle Restrepo, Mihir Patel, Mik Vyatskov, Mikayel Samvelyan, Mike Clark, Mike Macey, Mike Wang, Miquel Jubert Hermoso, Mo Metanat, Mohammad Rastegari, Munish Bansal, Nandhini Santhanam, Natascha Parks, Natasha White, Navyata Bawa, Nayan Singhal, Nick Egebo, Nicolas Usunier, Nikhil Mehta, Nikolay Pavlovich Laptev, Ning Dong, Norman Cheng, Oleg Chernoguz, Olivia Hart, Omkar Salpekar, Ozlem Kalinli, Parkin Kent, Parth Parekh, Paul Saab, Pavan Balaji, Pedro Rittner, Philip Bontrager, Pierre Roux, Piotr Dollar, Polina Zvyagina, Prashant Ratanchandani, Pritish Yuvraj, Qian Liang, Rachad Alao, Rachel Rodriguez, Rafi Ayub, Raghotham Murthy, Raghu Nayani, Rahul Mitra, Rangaprabhu Parthasarathy, Raymond Li, Rebekkah Hogan, Robin Battey, Rocky Wang, Russ Howes, Ruty Rinott, Sachin Mehta, Sachin Siby, Sai Jayesh Bondu, Samyak Datta, Sara Chugh, Sara Hunt, Sargun Dhillon, Sasha Sidorov, Satadru Pan, Saurabh Mahajan, Saurabh Verma, Seiji Yamamoto, Sharadh Ramaswamy, Shaun Lindsay, Sheng Feng, Shenghao Lin, Shengxin Cindy Zha, Shishir Patil, Shiva Shankar, Shuqiang Zhang, Sinong Wang, Sneha Agarwal, Soji Sajuyigbe, Soumith Chintala, Stephanie Max, Stephen Chen, Steve Kehoe, Steve Satterfield, Sudarshan Govindaprasad, Sumit Gupta, Summer Deng, Sungmin Cho, Sunny Virk, Suraj Subramanian, Sy Choudhury, Sydney Goldman, Tal Remez, Tamar Glaser, Tamara Best, Thilo Koehler, Thomas Robinson, Tianhe Li, Tianjun Zhang, Tim Matthews, Timothy Chou, Tzook Shaked, Varun Vontimitta, Victoria Ajayi, Victoria Montanez, Vijai Mohan, Vinay Satish Kumar, Vishal Mangla, Vlad Ionescu, Vlad Poenaru, Vlad Tiberiu Mihailescu, Vladimir Ivanov, Wei Li, Wenchen Wang, WenWen Jiang, Wes Bouaziz, Will Constable, Xiaocheng Tang, Xiaojian Wu, Xiaolan Wang, Xilun Wu, Xinbo Gao, Yaniv Kleinman, Yanjun Chen, Ye Hu, Ye Jia, Ye Qi, Yenda Li, Yilin Zhang, Ying Zhang, Yossi Adi, Youngjin Nam, Yu, Wang, Yu Zhao, Yuchen Hao, Yundi Qian, Yunlu Li, Yuzi He, Zach Rait, Zachary DeVito, Zef Rosnbrick, Zhaoduo Wen, Zhenyu Yang, Zhiwei Zhao, Zhiyu Ma

This paper presents a new set of foundation models, called Llama 3.

answerability prediction Language Modeling +5

FTuner: A Fast Dynamic Shape Tensors Program Auto-Tuner for Deep Learning Compilers

no code implementations31 Jul 2024 Pengyu Mu, Linquan Wei, Yi Liu, Rui Wang

Instead of using a large design space or training a cost model, we use an abstract computational unit called the uKernel to patch together small, various-sized tensors to match the shape of the input tensor.

Relaxed Equivariant Graph Neural Networks

1 code implementation30 Jul 2024 Elyssa Hofgard, Rui Wang, Robin Walters, Tess Smidt

3D Euclidean symmetry equivariant neural networks have demonstrated notable success in modeling complex physical systems.

Longhorn: State Space Models are Amortized Online Learners

1 code implementation19 Jul 2024 Bo Liu, Rui Wang, Lemeng Wu, Yihao Feng, Peter Stone, Qiang Liu

Our experimental results show that Longhorn outperforms state-of-the-art SSMs, including the Mamba model, on standard sequence modeling benchmarks, language modeling, and vision tasks.

Language Modeling Language Modelling +2

Restoring Images in Adverse Weather Conditions via Histogram Transformer

1 code implementation14 Jul 2024 Shangquan Sun, Wenqi Ren, Xinwei Gao, Rui Wang, Xiaochun Cao

It is powered by a mechanism dubbed histogram self-attention, which sorts and segments spatial features into intensity-based bins.

Image Restoration

DiffPhyCon: A Generative Approach to Control Complex Physical Systems

1 code implementation9 Jul 2024 Long Wei, Peiyan Hu, Ruiqi Feng, Haodong Feng, Yixuan Du, Tao Zhang, Rui Wang, Yue Wang, Zhi-Ming Ma, Tailin Wu

In this work, we introduce Diffusion Physical systems Control (DiffPhyCon), a new class of method to address the physical systems control problem.

reinforcement-learning Reinforcement Learning

Evaluating Knowledge-based Cross-lingual Inconsistency in Large Language Models

1 code implementation1 Jul 2024 Xiaolin Xing, Zhiwei He, Haoyu Xu, Xing Wang, Rui Wang, Yu Hong

This paper investigates the cross-lingual inconsistencies observed in Large Language Models (LLMs), such as ChatGPT, Llama, and Baichuan, which have shown exceptional performance in various Natural Language Processing (NLP) tasks.

FAFE: Immune Complex Modeling with Geodesic Distance Loss on Noisy Group Frames

no code implementations1 Jul 2024 Ruidong Wu, Ruihan Guo, Rui Wang, Shitong Luo, Yue Xu, Jiahan Li, Jianzhu Ma, Qiang Liu, Yunan Luo, Jian Peng

Despite the striking success of general protein folding models such as AlphaFold2(AF2, Jumper et al. (2021)), the accurate computational modeling of antibody-antigen complexes remains a challenging task.

Protein Folding

Feature-prompting GBMSeg: One-Shot Reference Guided Training-Free Prompt Engineering for Glomerular Basement Membrane Segmentation

1 code implementation24 Jun 2024 Xueyu Liu, Guangze Shi, Rui Wang, Yexin Lai, Jianan Zhang, Lele Sun, Quan Yang, Yongfei Wu, Ming Li, Weixia Han, Wen Zheng

Experimental results on our collected 2538 TEM images confirm that GBMSeg achieves superior segmentation performance with a Dice similarity coefficient (DSC) of 87. 27% using only one labeled reference image in a training-free manner, outperforming recently proposed one-shot or few-shot methods.

Prompt Engineering Segmentation

Multimodal Physiological Signals Representation Learning via Multiscale Contrasting for Depression Recognition

no code implementations22 Jun 2024 Kai Shao, Rui Wang, Yixue Hao, Long Hu, Min Chen, Hans Arno Jacobsen

Furthermore, to enhance the learning of semantic representation associated with stimulation tasks, a semantic consistency contrast module is proposed, aiming to maximize the semantic similarity of fNIRS and EEG.

Data Augmentation EEG +3

DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving

1 code implementation18 Jun 2024 Yuxuan Tong, Xiwen Zhang, Rui Wang, Ruidong Wu, Junxian He

Solving mathematical problems requires advanced reasoning abilities and presents notable challenges for large language models.

Ranked #4 on Natural Questions on TheoremQA (using extra training data)

Arithmetic Reasoning Math +3

Balancing Performance and Cost for Two-Hop Cooperative Communications: Stackelberg Game and Distributed Multi-Agent Reinforcement Learning

1 code implementation17 Jun 2024 Yuanzhe Geng, Erwu Liu, Wei Ni, Rui Wang, Yan Liu, Hao Xu, Chen Cai, Abbas Jamalipour

This paper aims to balance performance and cost in a two-hop wireless cooperative communication network where the source and relays have contradictory optimization goals and make decisions in a distributed manner.

Multi-agent Reinforcement Learning

On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey

no code implementations14 Jun 2024 Lin Long, Rui Wang, Ruixuan Xiao, Junbo Zhao, Xiao Ding, Gang Chen, Haobo Wang

Within the evolving landscape of deep learning, the dilemma of data quantity and quality has been a long-standing problem.

Synthetic Data Generation

EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

1 code implementation13 Jun 2024 Yucheng Han, Rui Wang, Chi Zhang, Juntao Hu, Pei Cheng, Bin Fu, Hanwang Zhang

Recent advancements in image generation have enabled the creation of high-quality images from text conditions.

Conditional Image Generation

Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding

no code implementations12 Jun 2024 Rui Wang, Liping Chen, Kong Aik Lee, Zhen-Hua Ling

Voice anonymization has been developed as a technique for preserving privacy by replacing the speaker's voice in a speech signal with that of a pseudo-speaker, thereby obscuring the original voice attributes from machine recognition and human perception.

Disentanglement

Triage of 3D pathology data via 2.5D multiple-instance learning to guide pathologist assessments

1 code implementation11 Jun 2024 Gan Gao, Andrew H. Song, Fiona Wang, David Brenes, Rui Wang, Sarah S. L. Chow, Kevin W. Bishop, Lawrence D. True, Faisal Mahmood, Jonathan T. C. Liu

A potential early route towards clinical adoption for 3D pathology is to rely on pathologists for final diagnosis based on viewing familiar 2D H&E-like image sections from the 3D datasets.

Diagnostic Multiple Instance Learning

Fast-Fading Channel and Power Optimization of the Magnetic Inductive Cellular Network

no code implementations7 Jun 2024 Honglei Ma, Erwu Liu, Zhijun Fang, Rui Wang, Yongbin Gao, Wenjun Yu, Dongming Zhang

Finally, we propose the power control algorithm using the non-cooperative game and multiagent Q-learning methods to optimize the throughput of the cellular VMI network.

Q-Learning

Graph Neural Network Enhanced Retrieval for Question Answering of LLMs

no code implementations3 Jun 2024 Zijian Li, Qingyan Guo, Jiawei Shao, Lei Song, Jiang Bian, Jun Zhang, Rui Wang

A graph neural network (GNN) is then leveraged to exploit the relationships between passages and improve the retrieval of supporting passages.

Graph Neural Network Language Modelling +3

A Recipe for Charge Density Prediction

no code implementations29 May 2024 Xiang Fu, Andrew Rosen, Kyle Bystrom, Rui Wang, Albert Musaelian, Boris Kozinsky, Tess Smidt, Tommi Jaakkola

In density functional theory, charge density is the core attribute of atomic systems from which all chemical properties can be derived.

Attribute Prediction

Floor-Plan-aided Indoor Localization: Zero-Shot Learning Framework, Data Sets, and Prototype

1 code implementation22 May 2024 Haiyao Yu, Changyang She, Yunkai Hu, Geng Wang, Rui Wang, Branka Vucetic, Yonghui Li

Specifically, a graph neural network that is scalable to the number of access points (APs) and mobile devices (MDs) is used for obtaining coarse locations of MDs.

Graph Neural Network Indoor Localization +1

Ultra-Fast Adaptive Track Detection Network

1 code implementation22 May 2024 Hai Ni, Rui Wang, Scarlett Liu

Compared to the SOTA, the proposed model is competitive in both speed and accuracy.

Multiple-Choice Questions are Efficient and Robust LLM Evaluators

1 code implementation20 May 2024 Ziyin Zhang, Zhaokun Jiang, Lizhen Xu, Hongkun Hao, Rui Wang

We present GSM-MC, a multiple-choice (MC) dataset constructed by collecting answers and incorrect predictions on GSM8K from 60 open-source models.

GSM8K HumanEval +3

When Large Language Model Meets Optimization

no code implementations16 May 2024 Sen Huang, Kaixiang Yang, Sheng Qi, Rui Wang

Optimization algorithms and large language models (LLMs) enhance decision-making in dynamic environments by integrating artificial intelligence with traditional techniques.

Decision Making Language Modeling +3

Dual-Segment Clustering Strategy for Hierarchical Federated Learning in Heterogeneous Wireless Environments

no code implementations15 May 2024 Pengcheng Sun, Erwu Liu, Wei Ni, Kanglei Yu, Xinyu Qu, Rui Wang, Yanlong Bi, Chuanchun Zhang, Abbas Jamalipour

Non-independent and identically distributed (Non- IID) data adversely affects federated learning (FL) while heterogeneity in communication quality can undermine the reliability of model parameter transmission, potentially degrading wireless FL convergence.

Clustering Federated Learning

ReinWiFi: A Reinforcement-Learning-Based Framework for the Application-Layer QoS Optimization of WiFi Networks

1 code implementation6 May 2024 Qianren Li, Bojie Lv, Yuncong Hong, Rui Wang

In this paper, a reinforcement-learning-based scheduling framework is proposed and implemented to optimize the application-layer quality-of-service (QoS) of a practical wireless local area network (WLAN) suffering from unknown interference.

reinforcement-learning Reinforcement Learning +1

A Comprehensive Survey of Dynamic Graph Neural Networks: Models, Frameworks, Benchmarks, Experiments and Challenges

no code implementations1 May 2024 ZhengZhao Feng, Rui Wang, Tianxing Wang, Mingli Song, Sai Wu, Shuibing He

From the analysis and evaluation results, we identify key challenges and offer principles for future research to enhance the design of models and frameworks in the dynamic GNNs field.

PAD: Patch-Agnostic Defense against Adversarial Patch Attacks

1 code implementation CVPR 2024 Lihua Jing, Rui Wang, Wenqi Ren, Xin Dong, Cong Zou

Adversarial patch attacks present a significant threat to real-world object detectors due to their practical feasibility.

Nyonic Technical Report

1 code implementation24 Apr 2024 Junfeng Tian, Rui Wang, Cong Li, Yudong Zhou, Jun Liu, Jun Wang

This report details the development and key achievements of our latest language model designed for custom large language models.

Language Modeling Language Modelling

Beyond Chain-of-Thought: A Survey of Chain-of-X Paradigms for LLMs

no code implementations24 Apr 2024 Yu Xia, Rui Wang, Xu Liu, Mingyan Li, Tong Yu, Xiang Chen, Julian McAuley, Shuai Li

Chain-of-Thought (CoT) has been a widely adopted prompting method, eliciting impressive reasoning abilities of Large Language Models (LLMs).

Survey

SNR Maximization and Localization for UAV-IRS-Assisted Near-Field Systems

no code implementations24 Apr 2024 Hanfu Zhang, Yidan Mei, Erwu Liu, Rui Wang

This letter introduces a novel unmanned aerial vehicle (UAV)-intelligent reflecting surface (IRS) structure into near-field localization systems to enhance the design flexibility of IRS, thereby obtaining additional performance gains.

Rechargeable UAV Trajectory Optimization for Real-Time Persistent Data Collection of Large-Scale Sensor Networks

no code implementations24 Apr 2024 Rui Wang, Deshi Li, Qingqing Wu, Kaitao Meng, Boning Feng, Lele Cong

In this paper, we propose a rechargeable UAV-assisted periodic data collection scheme, where a UAV is dispatched to periodically collect data from sensor nodes (SNs) in the mission area and charged by a wireless charging platform.

Research on OPF control of three-phase four-wire low-voltage distribution network considering uncertainty

no code implementations24 Apr 2024 Rui Wang, Xiaoqing Bai, Shengquan Huang, Shoupu Wei

As power systems become more complex and uncertain, low-voltage distribution networks face numerous challenges, including three-phase imbalances caused by asymmetrical loads and distributed energy resources.

Stochastic Optimization

A SER-based Device Selection Mechanism in Multi-bits Quantization Federated Learning

no code implementations20 Apr 2024 Pengcheng Sun, Erwu Liu, Rui Wang

The quality of wireless communication will directly affect the performance of federated learning (FL), so this paper analyze the influence of wireless communication on FL through symbol error rate (SER).

Federated Learning Quantization

The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data

1 code implementation ICCV 2023 Zixuan Zhu, Rui Wang, Cong Zou, Lihua Jing

This inspires us to propose a novel dual-network training framework: The Victim and The Beneficiary (V&B), which exploits a poisoned model to train a clean model without extra benign samples.

Data Augmentation model

Cross-Modality Gait Recognition: Bridging LiDAR and Camera Modalities for Human Identification

no code implementations4 Apr 2024 Rui Wang, Chuanfu Shen, Manuel J. Marin-Jimenez, George Q. Huang, Shiqi Yu

Current gait recognition research mainly focuses on identifying pedestrians captured by the same type of sensor, neglecting the fact that individuals may be captured by different sensors in order to adapt to various environments.

Gait Recognition

Embedding-Informed Adaptive Retrieval-Augmented Generation of Large Language Models

no code implementations4 Apr 2024 Chengkai Huang, Yu Xia, Rui Wang, Kaige Xie, Tong Yu, Julian McAuley, Lina Yao

However, it was observed by previous works that retrieval is not always helpful, especially when the LLM is already knowledgeable on the query to answer.

Continual Learning Retrieval

Interpretable Multi-View Clustering Based on Anchor Graph Tensor Factorization

no code implementations1 Apr 2024 Rui Wang, Jing Li, Quanxue Gao, Cheng Deng

Nevertheless, existing multi-view clustering methods based on anchor graph factorization lack adequate cluster interpretability for the decomposed matrix and often overlook the inter-view information.

Clustering

Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs

no code implementations29 Mar 2024 Luchang Li, Sheng Qian, Jie Lu, Lunxi Yuan, Rui Wang, Qin Xie

The Large Language Model (LLM) is widely employed for tasks such as intelligent assistants, text summarization, translation, and multi-modality on mobile phones.

Language Modelling Large Language Model +2

Enhanced Short Text Modeling: Leveraging Large Language Models for Topic Refinement

1 code implementation26 Mar 2024 Shuyu Chang, Rui Wang, Peng Ren, Haiping Huang

Crafting effective topic models for brief texts, like tweets and news headlines, is essential for capturing the swift shifts in social dynamics.

Prompt Engineering Topic Models

InternLM2 Technical Report

3 code implementations26 Mar 2024 Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin

The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).

4k Long-Context Understanding

Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents

no code implementations23 Mar 2024 Hao Wang, Tang Li, Chenhui Chu, Nengjun Zhu, Rui Wang, Pinpin Zhu

This approach aims to generate relation representations that are more aware of the spatial context and unseen relation in a manner similar to human perception.

Document AI Reading Comprehension +2

AVT2-DWF: Improving Deepfake Detection with Audio-Visual Fusion and Dynamic Weighting Strategies

1 code implementation22 Mar 2024 Rui Wang, Dengpan Ye, Long Tang, Yunming Zhang, Jiacheng Deng

With the continuous improvements of deepfake methods, forgery messages have transitioned from single-modality to multi-modal fusion, posing new challenges for existing forgery detection algorithms.

DeepFake Detection Face Swapping

VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition

1 code implementation21 Mar 2024 Yun-Jin Li, Mariia Gladkova, Yan Xia, Rui Wang, Daniel Cremers

To tackle this issue, we propose Voxel-Cross-Pixel (VXP), a novel camera-to-LiDAR place recognition framework that enforces local similarities in a self-supervised manner and effectively brings global context from images and LiDAR scans into a shared feature space.

Cross-modal place recognition Cross-Modal Retrieval +1

AffineQuant: Affine Transformation Quantization for Large Language Models

1 code implementation19 Mar 2024 Yuexiao Ma, Huixia Li, Xiawu Zheng, Feng Ling, Xuefeng Xiao, Rui Wang, Shilei Wen, Fei Chao, Rongrong Ji

Among these techniques, Post-Training Quantization (PTQ) has emerged as a subject of considerable interest due to its noteworthy compression efficiency and cost-effectiveness in the context of training.

Quantization

StableGarment: Garment-Centric Generation via Stable Diffusion

no code implementations16 Mar 2024 Rui Wang, Hailong Guo, Jiaming Liu, Huaxia Li, Haibo Zhao, Xu Tang, Yao Hu, Hao Tang, Peipei Li

In this paper, we introduce StableGarment, a unified framework to tackle garment-centric(GC) generation tasks, including GC text-to-image, controllable GC text-to-image, stylized GC text-to-image, and robust virtual try-on.

Denoising Image Generation +1

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

2 code implementations8 Mar 2024 XiWei Hu, Rui Wang, Yixiao Fang, Bin Fu, Pei Cheng, Gang Yu

Diffusion models have demonstrated remarkable performance in the domain of text-to-image generation.

Denoising Language Modelling +2

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

1 code implementation8 Mar 2024 Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, Paul Voigtlaender, Rohan Jain, Gabriela Surita, Kareem Mohamed, Rory Blevins, Junwhan Ahn, Tao Zhu, Kornraphop Kawintiranon, Orhan Firat, Yiming Gu, Yujing Zhang, Matthew Rahtz, Manaal Faruqui, Natalie Clay, Justin Gilmer, JD Co-Reyes, Ivo Penchev, Rui Zhu, Nobuyuki Morioka, Kevin Hui, Krishna Haridasan, Victor Campos, Mahdis Mahdieh, Mandy Guo, Samer Hassan, Kevin Kilgour, Arpi Vezer, Heng-Tze Cheng, Raoul de Liedekerke, Siddharth Goyal, Paul Barham, DJ Strouse, Seb Noury, Jonas Adler, Mukund Sundararajan, Sharad Vikram, Dmitry Lepikhin, Michela Paganini, Xavier Garcia, Fan Yang, Dasha Valter, Maja Trebacz, Kiran Vodrahalli, Chulayuth Asawaroengchai, Roman Ring, Norbert Kalb, Livio Baldini Soares, Siddhartha Brahma, David Steiner, Tianhe Yu, Fabian Mentzer, Antoine He, Lucas Gonzalez, Bibo Xu, Raphael Lopez Kaufman, Laurent El Shafey, Junhyuk Oh, Tom Hennigan, George van den Driessche, Seth Odoom, Mario Lucic, Becca Roelofs, Sid Lall, Amit Marathe, Betty Chan, Santiago Ontanon, Luheng He, Denis Teplyashin, Jonathan Lai, Phil Crone, Bogdan Damoc, Lewis Ho, Sebastian Riedel, Karel Lenc, Chih-Kuan Yeh, Aakanksha Chowdhery, Yang Xu, Mehran Kazemi, Ehsan Amid, Anastasia Petrushkina, Kevin Swersky, Ali Khodaei, Gowoon Chen, Chris Larkin, Mario Pinto, Geng Yan, Adria Puigdomenech Badia, Piyush Patil, Steven Hansen, Dave Orr, Sebastien M. R. Arnold, Jordan Grimstad, Andrew Dai, Sholto Douglas, Rishika Sinha, Vikas Yadav, Xi Chen, Elena Gribovskaya, Jacob Austin, Jeffrey Zhao, Kaushal Patel, Paul Komarek, Sophia Austin, Sebastian Borgeaud, Linda Friso, Abhimanyu Goyal, Ben Caine, Kris Cao, Da-Woon Chung, Matthew Lamm, Gabe Barth-Maron, Thais Kagohara, Kate Olszewska, Mia Chen, Kaushik Shivakumar, Rishabh Agarwal, Harshal Godhia, Ravi Rajwar, Javier Snaider, Xerxes Dotiwalla, YuAn Liu, Aditya Barua, Victor Ungureanu, Yuan Zhang, Bat-Orgil Batsaikhan, Mateo Wirth, James Qin, Ivo Danihelka, Tulsee Doshi, Martin Chadwick, Jilin Chen, Sanil Jain, Quoc Le, Arjun Kar, Madhu Gurumurthy, Cheng Li, Ruoxin Sang, Fangyu Liu, Lampros Lamprou, Rich Munoz, Nathan Lintz, Harsh Mehta, Heidi Howard, Malcolm Reynolds, Lora Aroyo, Quan Wang, Lorenzo Blanco, Albin Cassirer, Jordan Griffith, Dipanjan Das, Stephan Lee, Jakub Sygnowski, Zach Fisher, James Besley, Richard Powell, Zafarali Ahmed, Dominik Paulus, David Reitter, Zalan Borsos, Rishabh Joshi, Aedan Pope, Steven Hand, Vittorio Selo, Vihan Jain, Nikhil Sethi, Megha Goel, Takaki Makino, Rhys May, Zhen Yang, Johan Schalkwyk, Christina Butterfield, Anja Hauth, Alex Goldin, Will Hawkins, Evan Senter, Sergey Brin, Oliver Woodman, Marvin Ritter, Eric Noland, Minh Giang, Vijay Bolina, Lisa Lee, Tim Blyth, Ian Mackinnon, Machel Reid, Obaid Sarvana, David Silver, Alexander Chen, Lily Wang, Loren Maggiore, Oscar Chang, Nithya Attaluri, Gregory Thornton, Chung-Cheng Chiu, Oskar Bunyan, Nir Levine, Timothy Chung, Evgenii Eltyshev, Xiance Si, Timothy Lillicrap, Demetra Brady, Vaibhav Aggarwal, Boxi Wu, Yuanzhong Xu, Ross Mcilroy, Kartikeya Badola, Paramjit Sandhu, Erica Moreira, Wojciech Stokowiec, Ross Hemsley, Dong Li, Alex Tudor, Pranav Shyam, Elahe Rahimtoroghi, Salem Haykal, Pablo Sprechmann, Xiang Zhou, Diana Mincu, Yujia Li, Ravi Addanki, Kalpesh Krishna, Xiao Wu, Alexandre Frechette, Matan Eyal, Allan Dafoe, Dave Lacey, Jay Whang, Thi Avrahami, Ye Zhang, Emanuel Taropa, Hanzhao Lin, Daniel Toyama, Eliza Rutherford, Motoki Sano, HyunJeong Choe, Alex Tomala, Chalence Safranek-Shrader, Nora Kassner, Mantas Pajarskas, Matt Harvey, Sean Sechrist, Meire Fortunato, Christina Lyu, Gamaleldin Elsayed, Chenkai Kuang, James Lottes, Eric Chu, Chao Jia, Chih-Wei Chen, Peter Humphreys, Kate Baumli, Connie Tao, Rajkumar Samuel, Cicero Nogueira dos santos, Anders Andreassen, Nemanja Rakićević, Dominik Grewe, Aviral Kumar, Stephanie Winkler, Jonathan Caton, Andrew Brock, Sid Dalmia, Hannah Sheahan, Iain Barr, Yingjie Miao, Paul Natsev, Jacob Devlin, Feryal Behbahani, Flavien Prost, Yanhua Sun, Artiom Myaskovsky, Thanumalayan Sankaranarayana Pillai, Dan Hurt, Angeliki Lazaridou, Xi Xiong, Ce Zheng, Fabio Pardo, Dan Horgan, Joe Stanton, Moran Ambar, Fei Xia, Alejandro Lince, Mingqiu Wang, Basil Mustafa, Albert Webson, Hyo Lee, Rohan Anil, Martin Wicke, Timothy Dozat, Abhishek Sinha, Enrique Piqueras, Elahe Dabir, Shyam Upadhyay, Anudhyan Boral, Lisa Anne Hendricks, Corey Fry, Josip Djolonga, Yi Su, Jake Walker, Jane Labanowski, Ronny Huang, Vedant Misra, Jeremy Chen, RJ Skerry-Ryan, Avi Singh, Shruti Rijhwani, Dian Yu, Alex Castro-Ros, Beer Changpinyo, Romina Datta, Sumit Bagri, Arnar Mar Hrafnkelsson, Marcello Maggioni, Daniel Zheng, Yury Sulsky, Shaobo Hou, Tom Le Paine, Antoine Yang, Jason Riesa, Dominika Rogozinska, Dror Marcus, Dalia El Badawy, Qiao Zhang, Luyu Wang, Helen Miller, Jeremy Greer, Lars Lowe Sjos, Azade Nova, Heiga Zen, Rahma Chaabouni, Mihaela Rosca, Jiepu Jiang, Charlie Chen, Ruibo Liu, Tara Sainath, Maxim Krikun, Alex Polozov, Jean-Baptiste Lespiau, Josh Newlan, Zeyncep Cankara, Soo Kwak, Yunhan Xu, Phil Chen, Andy Coenen, Clemens Meyer, Katerina Tsihlas, Ada Ma, Juraj Gottweis, Jinwei Xing, Chenjie Gu, Jin Miao, Christian Frank, Zeynep Cankara, Sanjay Ganapathy, Ishita Dasgupta, Steph Hughes-Fitt, Heng Chen, David Reid, Keran Rong, Hongmin Fan, Joost van Amersfoort, Vincent Zhuang, Aaron Cohen, Shixiang Shane Gu, Anhad Mohananey, Anastasija Ilic, Taylor Tobin, John Wieting, Anna Bortsova, Phoebe Thacker, Emma Wang, Emily Caveness, Justin Chiu, Eren Sezener, Alex Kaskasoli, Steven Baker, Katie Millican, Mohamed Elhawaty, Kostas Aisopos, Carl Lebsack, Nathan Byrd, Hanjun Dai, Wenhao Jia, Matthew Wiethoff, Elnaz Davoodi, Albert Weston, Lakshman Yagati, Arun Ahuja, Isabel Gao, Golan Pundak, Susan Zhang, Michael Azzam, Khe Chai Sim, Sergi Caelles, James Keeling, Abhanshu Sharma, Andy Swing, Yaguang Li, Chenxi Liu, Carrie Grimes Bostock, Yamini Bansal, Zachary Nado, Ankesh Anand, Josh Lipschultz, Abhijit Karmarkar, Lev Proleev, Abe Ittycheriah, Soheil Hassas Yeganeh, George Polovets, Aleksandra Faust, Jiao Sun, Alban Rrustemi, Pen Li, Rakesh Shivanna, Jeremiah Liu, Chris Welty, Federico Lebron, Anirudh Baddepudi, Sebastian Krause, Emilio Parisotto, Radu Soricut, Zheng Xu, Dawn Bloxwich, Melvin Johnson, Behnam Neyshabur, Justin Mao-Jones, Renshen Wang, Vinay Ramasesh, Zaheer Abbas, Arthur Guez, Constant Segal, Duc Dung Nguyen, James Svensson, Le Hou, Sarah York, Kieran Milan, Sophie Bridgers, Wiktor Gworek, Marco Tagliasacchi, James Lee-Thorp, Michael Chang, Alexey Guseynov, Ale Jakse Hartman, Michael Kwong, Ruizhe Zhao, Sheleem Kashem, Elizabeth Cole, Antoine Miech, Richard Tanburn, Mary Phuong, Filip Pavetic, Sebastien Cevey, Ramona Comanescu, Richard Ives, Sherry Yang, Cosmo Du, Bo Li, Zizhao Zhang, Mariko Iinuma, Clara Huiyi Hu, Aurko Roy, Shaan Bijwadia, Zhenkai Zhu, Danilo Martins, Rachel Saputro, Anita Gergely, Steven Zheng, Dawei Jia, Ioannis Antonoglou, Adam Sadovsky, Shane Gu, Yingying Bi, Alek Andreev, Sina Samangooei, Mina Khan, Tomas Kocisky, Angelos Filos, Chintu Kumar, Colton Bishop, Adams Yu, Sarah Hodkinson, Sid Mittal, Premal Shah, Alexandre Moufarek, Yong Cheng, Adam Bloniarz, Jaehoon Lee, Pedram Pejman, Paul Michel, Stephen Spencer, Vladimir Feinberg, Xuehan Xiong, Nikolay Savinov, Charlotte Smith, Siamak Shakeri, Dustin Tran, Mary Chesus, Bernd Bohnet, George Tucker, Tamara von Glehn, Carrie Muir, Yiran Mao, Hideto Kazawa, Ambrose Slone, Kedar Soparkar, Disha Shrivastava, James Cobon-Kerr, Michael Sharman, Jay Pavagadhi, Carlos Araya, Karolis Misiunas, Nimesh Ghelani, Michael Laskin, David Barker, Qiujia Li, Anton Briukhov, Neil Houlsby, Mia Glaese, Balaji Lakshminarayanan, Nathan Schucher, Yunhao Tang, Eli Collins, Hyeontaek Lim, Fangxiaoyu Feng, Adria Recasens, Guangda Lai, Alberto Magni, Nicola De Cao, Aditya Siddhant, Zoe Ashwood, Jordi Orbay, Mostafa Dehghani, Jenny Brennan, Yifan He, Kelvin Xu, Yang Gao, Carl Saroufim, James Molloy, Xinyi Wu, Seb Arnold, Solomon Chang, Julian Schrittwieser, Elena Buchatskaya, Soroush Radpour, Martin Polacek, Skye Giordano, Ankur Bapna, Simon Tokumine, Vincent Hellendoorn, Thibault Sottiaux, Sarah Cogan, Aliaksei Severyn, Mohammad Saleh, Shantanu Thakoor, Laurent Shefey, Siyuan Qiao, Meenu Gaba, Shuo-Yiin Chang, Craig Swanson, Biao Zhang, Benjamin Lee, Paul Kishan Rubenstein, Gan Song, Tom Kwiatkowski, Anna Koop, Ajay Kannan, David Kao, Parker Schuh, Axel Stjerngren, Golnaz Ghiasi, Gena Gibson, Luke Vilnis, Ye Yuan, Felipe Tiengo Ferreira, Aishwarya Kamath, Ted Klimenko, Ken Franko, Kefan Xiao, Indro Bhattacharya, Miteyan Patel, Rui Wang, Alex Morris, Robin Strudel, Vivek Sharma, Peter Choy, Sayed Hadi Hashemi, Jessica Landon, Mara Finkelstein, Priya Jhakra, Justin Frye, Megan Barnes, Matthew Mauger, Dennis Daun, Khuslen Baatarsukh, Matthew Tung, Wael Farhan, Henryk Michalewski, Fabio Viola, Felix de Chaumont Quitry, Charline Le Lan, Tom Hudson, Qingze Wang, Felix Fischer, Ivy Zheng, Elspeth White, Anca Dragan, Jean-Baptiste Alayrac, Eric Ni, Alexander Pritzel, Adam Iwanicki, Michael Isard, Anna Bulanova, Lukas Zilka, Ethan Dyer, Devendra Sachan, Srivatsan Srinivasan, Hannah Muckenhirn, Honglong Cai, Amol Mandhane, Mukarram Tariq, Jack W. Rae, Gary Wang, Kareem Ayoub, Nicholas FitzGerald, Yao Zhao, Woohyun Han, Chris Alberti, Dan Garrette, Kashyap Krishnakumar, Mai Gimenez, Anselm Levskaya, Daniel Sohn, Josip Matak, Inaki Iturrate, Michael B. Chang, Jackie Xiang, Yuan Cao, Nishant Ranka, Geoff Brown, Adrian Hutter, Nanxin Chen, Kaisheng Yao, Zoltan Egyed, Francois Galilee, Tyler Liechty, Praveen Kallakuri, Evan Palmer, Sanjay Ghemawat, Jasmine Liu, David Tao, Chloe Thornton, Tim Green, Mimi Jasarevic, Sharon Lin, Victor Cotruta, Yi-Xuan Tan, Noah Fiedel, Hongkun Yu, Ed Chi, Alexander Neitz, Jens Heitkaemper, Anu Sinha, Denny Zhou, Yi Sun, Charbel Kaed, Brice Hulse, Swaroop Mishra, Maria Georgaki, Sneha Kudugunta, Clement Farabet, Izhak Shafran, Daniel Vlasic, Anton Tsitsulin, Rajagopal Ananthanarayanan, Alen Carin, Guolong Su, Pei Sun, Shashank V, Gabriel Carvajal, Josef Broder, Iulia Comsa, Alena Repina, William Wong, Warren Weilun Chen, Peter Hawkins, Egor Filonov, Lucia Loher, Christoph Hirnschall, Weiyi Wang, Jingchen Ye, Andrea Burns, Hardie Cate, Diana Gage Wright, Federico Piccinini, Lei Zhang, Chu-Cheng Lin, Ionel Gog, Yana Kulizhskaya, Ashwin Sreevatsa, Shuang Song, Luis C. Cobo, Anand Iyer, Chetan Tekur, Guillermo Garrido, Zhuyun Xiao, Rupert Kemp, Huaixiu Steven Zheng, Hui Li, Ananth Agarwal, Christel Ngani, Kati Goshvadi, Rebeca Santamaria-Fernandez, Wojciech Fica, Xinyun Chen, Chris Gorgolewski, Sean Sun, Roopal Garg, Xinyu Ye, S. M. Ali Eslami, Nan Hua, Jon Simon, Pratik Joshi, Yelin Kim, Ian Tenney, Sahitya Potluri, Lam Nguyen Thiet, Quan Yuan, Florian Luisier, Alexandra Chronopoulou, Salvatore Scellato, Praveen Srinivasan, Minmin Chen, Vinod Koverkathu, Valentin Dalibard, Yaming Xu, Brennan Saeta, Keith Anderson, Thibault Sellam, Nick Fernando, Fantine Huot, Junehyuk Jung, Mani Varadarajan, MICHAEL QUINN, Amit Raul, Maigo Le, Ruslan Habalov, Jon Clark, Komal Jalan, Kalesha Bullard, Achintya Singhal, Thang Luong, Boyu Wang, Sujeevan Rajayogam, Julian Eisenschlos, Johnson Jia, Daniel Finchelstein, Alex Yakubovich, Daniel Balle, Michael Fink, Sameer Agarwal, Jing Li, DJ Dvijotham, Shalini Pal, Kai Kang, Jaclyn Konzelmann, Jennifer Beattie, Olivier Dousse, Diane Wu, Remi Crocker, Chen Elkind, Siddhartha Reddy Jonnalagadda, Jong Lee, Dan Holtmann-Rice, Krystal Kallarackal, Rosanne Liu, Denis Vnukov, Neera Vats, Luca Invernizzi, Mohsen Jafari, Huanjie Zhou, Lilly Taylor, Jennifer Prendki, Marcus Wu, Tom Eccles, Tianqi Liu, Kavya Kopparapu, Francoise Beaufays, Christof Angermueller, Andreea Marzoca, Shourya Sarcar, Hilal Dib, Jeff Stanway, Frank Perbet, Nejc Trdin, Rachel Sterneck, Andrey Khorlin, Dinghua Li, Xihui Wu, Sonam Goenka, David Madras, Sasha Goldshtein, Willi Gierke, Tong Zhou, Yaxin Liu, Yannie Liang, Anais White, Yunjie Li, Shreya Singh, Sanaz Bahargam, Mark Epstein, Sujoy Basu, Li Lao, Adnan Ozturel, Carl Crous, Alex Zhai, Han Lu, Zora Tung, Neeraj Gaur, Alanna Walton, Lucas Dixon, Ming Zhang, Amir Globerson, Grant Uy, Andrew Bolt, Olivia Wiles, Milad Nasr, Ilia Shumailov, Marco Selvi, Francesco Piccinno, Ricardo Aguilar, Sara McCarthy, Misha Khalman, Mrinal Shukla, Vlado Galic, John Carpenter, Kevin Villela, Haibin Zhang, Harry Richardson, James Martens, Matko Bosnjak, Shreyas Rammohan Belle, Jeff Seibert, Mahmoud Alnahlawi, Brian McWilliams, Sankalp Singh, Annie Louis, Wen Ding, Dan Popovici, Lenin Simicich, Laura Knight, Pulkit Mehta, Nishesh Gupta, Chongyang Shi, Saaber Fatehi, Jovana Mitrovic, Alex Grills, Joseph Pagadora, Tsendsuren Munkhdalai, Dessie Petrova, Danielle Eisenbud, Zhishuai Zhang, Damion Yates, Bhavishya Mittal, Nilesh Tripuraneni, Yannis Assael, Thomas Brovelli, Prateek Jain, Mihajlo Velimirovic, Canfer Akbulut, Jiaqi Mu, Wolfgang Macherey, Ravin Kumar, Jun Xu, Haroon Qureshi, Gheorghe Comanici, Jeremy Wiesner, Zhitao Gong, Anton Ruddock, Matthias Bauer, Nick Felt, Anirudh GP, Anurag Arnab, Dustin Zelle, Jonas Rothfuss, Bill Rosgen, Ashish Shenoy, Bryan Seybold, Xinjian Li, Jayaram Mudigonda, Goker Erdogan, Jiawei Xia, Jiri Simsa, Andrea Michi, Yi Yao, Christopher Yew, Steven Kan, Isaac Caswell, Carey Radebaugh, Andre Elisseeff, Pedro Valenzuela, Kay McKinney, Kim Paterson, Albert Cui, Eri Latorre-Chimoto, Solomon Kim, William Zeng, Ken Durden, Priya Ponnapalli, Tiberiu Sosea, Christopher A. Choquette-Choo, James Manyika, Brona Robenek, Harsha Vashisht, Sebastien Pereira, Hoi Lam, Marko Velic, Denese Owusu-Afriyie, Katherine Lee, Tolga Bolukbasi, Alicia Parrish, Shawn Lu, Jane Park, Balaji Venkatraman, Alice Talbert, Lambert Rosique, Yuchung Cheng, Andrei Sozanschi, Adam Paszke, Praveen Kumar, Jessica Austin, Lu Li, Khalid Salama, Bartek Perz, Wooyeol Kim, Nandita Dukkipati, Anthony Baryshnikov, Christos Kaplanis, XiangHai Sheng, Yuri Chervonyi, Caglar Unlu, Diego de Las Casas, Harry Askham, Kathryn Tunyasuvunakool, Felix Gimeno, Siim Poder, Chester Kwak, Matt Miecnikowski, Vahab Mirrokni, Alek Dimitriev, Aaron Parisi, Dangyi Liu, Tomy Tsai, Toby Shevlane, Christina Kouridi, Drew Garmon, Adrian Goedeckemeyer, Adam R. Brown, Anitha Vijayakumar, Ali Elqursh, Sadegh Jazayeri, Jin Huang, Sara Mc Carthy, Jay Hoover, Lucy Kim, Sandeep Kumar, Wei Chen, Courtney Biles, Garrett Bingham, Evan Rosen, Lisa Wang, Qijun Tan, David Engel, Francesco Pongetti, Dario de Cesare, Dongseong Hwang, Lily Yu, Jennifer Pullman, Srini Narayanan, Kyle Levin, Siddharth Gopal, Megan Li, Asaf Aharoni, Trieu Trinh, Jessica Lo, Norman Casagrande, Roopali Vij, Loic Matthey, Bramandia Ramadhana, Austin Matthews, CJ Carey, Matthew Johnson, Kremena Goranova, Rohin Shah, Shereen Ashraf, Kingshuk Dasgupta, Rasmus Larsen, Yicheng Wang, Manish Reddy Vuyyuru, Chong Jiang, Joana Ijazi, Kazuki Osawa, Celine Smith, Ramya Sree Boppana, Taylan Bilal, Yuma Koizumi, Ying Xu, Yasemin Altun, Nir Shabat, Ben Bariach, Alex Korchemniy, Kiam Choo, Olaf Ronneberger, Chimezie Iwuanyanwu, Shubin Zhao, David Soergel, Cho-Jui Hsieh, Irene Cai, Shariq Iqbal, Martin Sundermeyer, Zhe Chen, Elie Bursztein, Chaitanya Malaviya, Fadi Biadsy, Prakash Shroff, Inderjit Dhillon, Tejasi Latkar, Chris Dyer, Hannah Forbes, Massimo Nicosia, Vitaly Nikolaev, Somer Greene, Marin Georgiev, Pidong Wang, Nina Martin, Hanie Sedghi, John Zhang, Praseem Banzal, Doug Fritz, Vikram Rao, Xuezhi Wang, Jiageng Zhang, Viorica Patraucean, Dayou Du, Igor Mordatch, Ivan Jurin, Lewis Liu, Ayush Dubey, Abhi Mohan, Janek Nowakowski, Vlad-Doru Ion, Nan Wei, Reiko Tojo, Maria Abi Raad, Drew A. Hudson, Vaishakh Keshava, Shubham Agrawal, Kevin Ramirez, Zhichun Wu, Hoang Nguyen, Ji Liu, Madhavi Sewak, Bryce Petrini, DongHyun Choi, Ivan Philips, Ziyue Wang, Ioana Bica, Ankush Garg, Jarek Wilkiewicz, Priyanka Agrawal, Xiaowei Li, Danhao Guo, Emily Xue, Naseer Shaik, Andrew Leach, Sadh MNM Khan, Julia Wiesinger, Sammy Jerome, Abhishek Chakladar, Alek Wenjiao Wang, Tina Ornduff, Folake Abu, Alireza Ghaffarkhah, Marcus Wainwright, Mario Cortes, Frederick Liu, Joshua Maynez, Andreas Terzis, Pouya Samangouei, Riham Mansour, Tomasz Kępa, François-Xavier Aubet, Anton Algymr, Dan Banica, Agoston Weisz, Andras Orban, Alexandre Senges, Ewa Andrejczuk, Mark Geller, Niccolo Dal Santo, Valentin Anklin, Majd Al Merey, Martin Baeuml, Trevor Strohman, Junwen Bai, Slav Petrov, Yonghui Wu, Demis Hassabis, Koray Kavukcuoglu, Jeff Dean, Oriol Vinyals

In this report, we introduce the Gemini 1. 5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio.

1 Image, 2*2 Stitching Code Generation +8

RIS-empowered Topology Control for Distributed Learning in Urban Air Mobility

no code implementations8 Mar 2024 Kai Xiong, Rui Wang, Supeng Leng, Wenyang Che, Chongwen Huang, Chau Yuen

Urban Air Mobility (UAM) expands vehicles from the ground to the near-ground space, envisioned as a revolution for transportation systems.

Federated Learning MULTI-VIEW LEARNING

F$^3$Loc: Fusion and Filtering for Floorplan Localization

no code implementations5 Mar 2024 Changan Chen, Rui Wang, Christoph Vogel, Marc Pollefeys

In this paper we propose an efficient data-driven solution to self-localization within a floorplan.

Drug Resistance Predictions Based on a Directed Flag Transformer

no code implementations5 Mar 2024 Dong Chen, Gengzhuo Liu, Hongyan Du, Benjamin Jones, JunJie Wee, Rui Wang, Jiahui Chen, Jana Shen, Guo-Wei Wei

CAPTURE combines a comprehensive mutation analysis with a resistance prediction module based on DFFormer-seq, which is a novel ensemble model that leverages a new Directed Flag Transformer and sequence embeddings from the protein and small-molecule-large-language models.

Drug Discovery

Hypothesis Spaces for Deep Learning

no code implementations5 Mar 2024 Rui Wang, Yuesheng Xu, Mingsong Yan

The representer theorems unfold that solutions of these learning models can be expressed as linear combination of a finite number of kernel sessions determined by given data and the reproducing kernel.

Deep Learning

Role Prompting Guided Domain Adaptation with General Capability Preserve for Large Language Models

no code implementations5 Mar 2024 Rui Wang, Fei Mi, Yi Chen, Boyang Xue, Hongru Wang, Qi Zhu, Kam-Fai Wong, Ruifeng Xu

2) Role Prompting assigns a central prompt to the general domain and a unique role prompt to each specific domain to minimize inter-domain confusion during training.

Domain Adaptation

Logit Standardization in Knowledge Distillation

1 code implementation CVPR 2024 Shangquan Sun, Wenqi Ren, Jingzhi Li, Rui Wang, Xiaochun Cao

Knowledge distillation involves transferring soft labels from a teacher to a student using a shared temperature-based softmax function.

Knowledge Distillation

Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training

1 code implementation1 Mar 2024 Qingyan Guo, Rui Wang, Junliang Guo, Xu Tan, Jiang Bian, Yujiu Yang

Accordingly, permutation on the training data is considered as a potential solution, since this can make the model predict antecedent words or tokens.

Language Modelling

Beyond Language Models: Byte Models are Digital World Simulators

no code implementations29 Feb 2024 Shangda Wu, Xu Tan, Zili Wang, Rui Wang, Xiaobing Li, Maosong Sun

Traditional deep learning often overlooks bytes, the basic units of the digital world, where all forms of information and operations are encoded and manipulated in binary format.

Prediction

Improving Open-Ended Text Generation via Adaptive Decoding

1 code implementation28 Feb 2024 Wenhong Zhu, Hongkun Hao, Zhiwei He, Yiming Ai, Rui Wang

Current language models decode text token by token according to probabilistic distribution, and determining the appropriate candidates for the next token is crucial to ensure generation quality.

Diversity Story Generation

UniRetriever: Multi-task Candidates Selection for Various Context-Adaptive Conversational Retrieval

no code implementations26 Feb 2024 Hongru Wang, Boyang Xue, Baohang Zhou, Rui Wang, Fei Mi, Weichao Wang, Yasheng Wang, Kam-Fai Wong

Conversational retrieval refers to an information retrieval system that operates in an iterative and interactive manner, requiring the retrieval of various external resources, such as persona, knowledge, and even response, to effectively engage with the user and successfully complete the dialogue.

Information Retrieval Retrieval

Low-Frequency Black-Box Backdoor Attack via Evolutionary Algorithm

no code implementations23 Feb 2024 Yanqi Qiao, Dazhuang Liu, Rui Wang, Kaitai Liang

Extensive experiments on real-world datasets verify the effectiveness and robustness of LFBA against image processing operations and the state-of-the-art backdoor defenses, as well as its inherent stealthiness in both spatial and frequency space, making it resilient against frequency inspection.

Backdoor Attack

Is Self-knowledge and Action Consistent or Not: Investigating Large Language Model's Personality

no code implementations22 Feb 2024 Yiming Ai, Zhiwei He, Ziyin Zhang, Wenhong Zhu, Hongkun Hao, Kai Yu, Lingjun Chen, Rui Wang

In this study, we delve into the validity of conventional personality questionnaires in capturing the human-like personality traits of Large Language Models (LLMs).

Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models

1 code implementation21 Feb 2024 Zhiwei He, Binglin Zhou, Hongkun Hao, Aiwei Liu, Xing Wang, Zhaopeng Tu, Zhuosheng Zhang, Rui Wang

Furthermore, we analyze two key factors that contribute to the cross-lingual consistency in text watermarking and propose X-SIR as a defense method against CWRA.

TAG

A Comprehensive Study of Multilingual Confidence Estimation on Large Language Models

1 code implementation21 Feb 2024 Boyang Xue, Hongru Wang, Rui Wang, Sheng Wang, Zezhong Wang, Yiming Du, Bin Liang, Kam-Fai Wong

This paper addresses this gap by introducing a comprehensive investigation of Multilingual Confidence estimation (MlingConf) on LLMs, focusing on both language-agnostic (LA) and language-specific (LS) tasks to explore the performance and language dominance effects of multilingual confidence estimations on different tasks.

Unsupervised Sign Language Translation and Generation

no code implementations12 Feb 2024 Zhengsheng Guo, Zhiwei He, Wenxiang Jiao, Xing Wang, Rui Wang, Kehai Chen, Zhaopeng Tu, Yong Xu, Min Zhang

Motivated by the success of unsupervised neural machine translation (UNMT), we introduce an unsupervised sign language translation and generation network (USLNet), which learns from abundant single-modality (text and video) data without parallel sign language data.

Machine Translation Sign Language Translation +1

A Novel Paradigm in Solving Multiscale Problems

no code implementations7 Feb 2024 Jing Wang, Zheng Li, Pengyu Lai, Rui Wang, Di Yang, Dewu Yang, Hui Xu, Wen-Quan Tao

By enabling the acquisition of large-scale data with minimal computational demands, coupled with the efficient and accurate characterization of small-scale dynamics via Spectral PINN, our approach offers a valuable and promising approach for researchers seeking to tackle multiscale phenomena effectively.

Partial Identification of Binary Choice Models with Misreported Outcomes

no code implementations30 Jan 2024 Orville Mondal, Rui Wang

In the first approach, the instrument is assumed to only affect the true dependent variable but not misreporting probabilities.

UniMS-RAG: A Unified Multi-source Retrieval-Augmented Generation for Personalized Dialogue Systems

no code implementations24 Jan 2024 Hongru Wang, WenYu Huang, Yang Deng, Rui Wang, Zezhong Wang, YuFei Wang, Fei Mi, Jeff Z. Pan, Kam-Fai Wong

To better plan and incorporate the use of multiple sources in generating personalized response, we firstly decompose it into three sub-tasks: Knowledge Source Selection, Knowledge Retrieval, and Response Generation.

RAG Response Generation +1

Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model

1 code implementation23 Jan 2024 Zhiwei He, Xing Wang, Wenxiang Jiao, Zhuosheng Zhang, Rui Wang, Shuming Shi, Zhaopeng Tu

In this work, we investigate the potential of employing the QE model as the reward model to predict human preferences for feedback training.

Machine Translation Translation

T2MAC: Targeted and Trusted Multi-Agent Communication through Selective Engagement and Evidence-Driven Integration

no code implementations19 Jan 2024 Chuxiong Sun, Zehua Zang, Jiabao Li, Jiangmeng Li, Xiao Xu, Rui Wang, Changwen Zheng

This process enables agents to collectively use evidence garnered from multiple perspectives, fostering trusted and cooperative behaviors.

SMAC+

R-Judge: Benchmarking Safety Risk Awareness for LLM Agents

1 code implementation18 Jan 2024 Tongxin Yuan, Zhiwei He, Lingzhong Dong, Yiming Wang, Ruijie Zhao, Tian Xia, Lizhen Xu, Binglin Zhou, Fangqi Li, Zhuosheng Zhang, Rui Wang, Gongshen Liu

We introduce R-Judge, a benchmark crafted to evaluate the proficiency of LLMs in judging and identifying safety risks given agent interaction records.

Benchmarking

DiffusionGPT: LLM-Driven Text-to-Image Generation System

no code implementations18 Jan 2024 Jie Qin, Jie Wu, Weifeng Chen, Yuxi Ren, Huixia Li, Hefeng Wu, Xuefeng Xiao, Rui Wang, Shilei Wen

Diffusion models have opened up new avenues for the field of image generation, resulting in the proliferation of high-quality models shared on open-source platforms.

Model Selection Text-to-Image Generation

Passive Beamforming For Practical RIS-Assisted Communication Systems With Non-Ideal Hardware

no code implementations15 Jan 2024 Yiming Liu, Rui Wang, Zhu Han

Reconfigurable intelligent surface (RIS) technology is a promising solution to improve the performance of existing wireless communications.

Toward distortion-aware change detection in realistic scenarios

no code implementations10 Jan 2024 Yitao Zhao, Heng-Chao Li, Nanqing Liu, Rui Wang

The whole framework is composed of Pretext Representation Pre-training, Bitemporal Image Alignment, and Down-stream Decoder Fine-Tuning.

Change Detection Decoder

SEED-Bench: Benchmarking Multimodal Large Language Models

1 code implementation CVPR 2024 Bohao Li, Yuying Ge, Yixiao Ge, Guangzhi Wang, Rui Wang, Ruimao Zhang, Ying Shan

Multimodal large language models (MLLMs) building upon the foundation of powerful large language models (LLMs) have recently demonstrated exceptional capabilities in generating not only texts but also images given interleaved multimodal inputs (acting like a combination of GPT-4V and DALL-E 3).

Benchmarking Image Generation +1

F3Loc: Fusion and Filtering for Floorplan Localization

no code implementations CVPR 2024 Changan Chen, Rui Wang, Christoph Vogel, Marc Pollefeys

In this paper we propose an efficient data-driven solution to self-localization within a floorplan.

Boosting Large Language Model for Speech Synthesis: An Empirical Study

no code implementations30 Dec 2023 Hongkun Hao, Long Zhou, Shujie Liu, Jinyu Li, Shujie Hu, Rui Wang, Furu Wei

In this paper, we conduct a comprehensive empirical exploration of boosting LLMs with the ability to generate speech, by combining pre-trained LLM LLaMA/OPT and text-to-speech synthesis model VALL-E. We compare three integration methods between LLMs and speech synthesis models, including directly fine-tuned LLMs, superposed layers of LLMs and VALL-E, and coupled LLMs and VALL-E using LLMs as a powerful text encoder.

Language Modeling Language Modelling +5

Identification of Nonlinear Dynamic Panels under Partial Stationarity

no code implementations30 Dec 2023 Wayne Yuan Gao, Rui Wang

This paper studies identification for a wide range of nonlinear panel data models, including binary choice, ordered response, and other types of limited dependent variable models.

Exploring 3D-aware Lifespan Face Aging via Disentangled Shape-Texture Representations

no code implementations28 Dec 2023 Qianrui Teng, Rui Wang, Xing Cui, Peipei Li, Zhaofeng He

Existing face aging methods often focus on modeling either texture aging or using an entangled shape-texture representation to achieve face aging.

3D Face Reconstruction Texture Synthesis

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation

1 code implementation CVPR 2024 Yuxuan Zhang, Yiren Song, Jiaming Liu, Rui Wang, Jinpeng Yu, Hao Tang, Huaxia Li, Xu Tang, Yao Hu, Han Pan, Zhongliang Jing

Recent advancements in subject-driven image generation have led to zero-shot generation, yet precise selection and focus on crucial subject representations remain challenging.

Image Generation

Reconfigurable Intelligent Surface-Assisted Localization in OFDM Systems with Carrier Frequency Offset and Phase Noise

no code implementations19 Dec 2023 Hanfu Zhang, Erwu Liu, Rui Wang, Wei Ni, Zhe Xing, Yan Liu, Abbas Jamalipour

The localization accuracy of the proposed algorithm is close to the analytical lower bound, with a root mean square error of lower than $\rm 10^{-2} \: m$.

Position

Rethinking Dimensional Rationale in Graph Contrastive Learning from Causal Perspective

1 code implementation16 Dec 2023 Qirui Ji, Jiangmeng Li, Jie Hu, Rui Wang, Changwen Zheng, Fanjiang Xu

To this end, with the purpose of exploring the intrinsic rationale of graphs, we accordingly propose to capture the dimensional rationale from graphs, which has not received sufficient attention in the literature.

Contrastive Learning Meta-Learning

Transferring CLIP's Knowledge into Zero-Shot Point Cloud Semantic Segmentation

no code implementations12 Dec 2023 Yuanbin Wang, Shaofei Huang, Yulu Gao, Zhen Wang, Rui Wang, Kehua Sheng, Bo Zhang, Si Liu

In this work, we focus on zero-shot point cloud semantic segmentation and propose a simple yet effective baseline to transfer the visual-linguistic knowledge implied in CLIP to point cloud encoder at both feature and output levels.

3D Semantic Segmentation Point Cloud Segmentation +2

Vision-language Assisted Attribute Learning

no code implementations12 Dec 2023 Kongming Liang, Xinran Wang, Rui Wang, Donghui Gao, Ling Jin, Weidong Liu, Xiatian Zhu, Zhanyu Ma, Jun Guo

Attribute labeling at large scale is typically incomplete and partial, posing significant challenges to model optimization.

Attribute Language Modeling +3

DiffCast: A Unified Framework via Residual Diffusion for Precipitation Nowcasting

1 code implementation CVPR 2024 Demin Yu, Xutao Li, Yunming Ye, Baoquan Zhang, Chuyao Luo, Kuai Dai, Rui Wang, Xunlai Chen

A unified and flexible framework that can equip any type of spatio-temporal models is proposed based on residual diffusion, which effectively tackles the shortcomings of previous methods.

EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning

1 code implementation11 Dec 2023 Yi Chen, Yuying Ge, Yixiao Ge, Mingyu Ding, Bohao Li, Rui Wang, Ruifeng Xu, Ying Shan, Xihui Liu

The pursuit of artificial general intelligence (AGI) has been accelerated by Multimodal Large Language Models (MLLMs), which exhibit superior reasoning, generalization capabilities, and proficiency in processing multimodal inputs.

Benchmarking Human-Object Interaction Detection +1

Unsupervised Social Event Detection via Hybrid Graph Contrastive Learning and Reinforced Incremental Clustering

1 code implementation8 Dec 2023 Yuanyuan Guo, Zehua Zang, Hang Gao, Xiao Xu, Rui Wang, Lixiang Liu, Jiangmeng Li

To this end, recent works explore learning discriminative information from social messages by leveraging graph contrastive learning (GCL) and embedding clustering in an unsupervised manner.

Clustering Contrastive Learning +1

Multi-scale Residual Transformer for VLF Lightning Transients Classification

no code implementations7 Dec 2023 Jinghao Sun, Tingting Ji, Guoyu Wang, Rui Wang

The utilization of Very Low Frequency (VLF) electromagnetic signals in navigation systems is widespread.

Classification

FaceStudio: Put Your Face Everywhere in Seconds

no code implementations5 Dec 2023 Yuxuan Yan, Chi Zhang, Rui Wang, Yichao Zhou, Gege Zhang, Pei Cheng, Gang Yu, Bin Fu

This study investigates identity-preserving image synthesis, an intriguing task in image generation that seeks to maintain a subject's identity while adding a personalized, stylistic touch.

Image Generation

VIoTGPT: Learning to Schedule Vision Tools in LLMs towards Intelligent Video Internet of Things

1 code implementation1 Dec 2023 Yaoyao Zhong, Mengshi Qi, Rui Wang, Yuhan Qiu, Yang Zhang, Huadong Ma

Video Internet of Things (VIoT) has shown full potential in collecting an unprecedented volume of video data.

SEED-Bench-2: Benchmarking Multimodal Large Language Models

2 code implementations28 Nov 2023 Bohao Li, Yuying Ge, Yixiao Ge, Guangzhi Wang, Rui Wang, Ruimao Zhang, Ying Shan

Multimodal large language models (MLLMs), building upon the foundation of powerful large language models (LLMs), have recently demonstrated exceptional capabilities in generating not only texts but also images given interleaved multimodal inputs (acting like a combination of GPT-4V and DALL-E 3).

Benchmarking Image Generation +1

Riemannian Self-Attention Mechanism for SPD Networks

no code implementations28 Nov 2023 Rui Wang, Xiao-Jun Wu, Hui Li, Josef Kittler

Symmetric positive definite (SPD) matrix has been demonstrated to be an effective feature descriptor in many scientific areas, as it can encode spatiotemporal statistics of the data adequately on a curved Riemannian manifold, i. e., SPD manifold.

Benchmarking Riemannian optimization

Hessian Aware Low-Rank Perturbation for Order-Robust Continual Learning

1 code implementation26 Nov 2023 Jiaqi Li, Yuanhao Lai, Rui Wang, Changjian Shui, Sabyasachi Sahoo, Charles X. Ling, Shichun Yang, Boyu Wang, Christian Gagné, Fan Zhou

We conduct extensive experiments on various benchmarks, including a dataset with large-scale tasks, and compare our method against some recent state-of-the-art methods to demonstrate the effectiveness and scalability of our proposed method.

Continual Learning

Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents

1 code implementation20 Nov 2023 Zhuosheng Zhang, Yao Yao, Aston Zhang, Xiangru Tang, Xinbei Ma, Zhiwei He, Yiming Wang, Mark Gerstein, Rui Wang, Gongshen Liu, Hai Zhao

Large language models (LLMs) have dramatically enhanced the field of language intelligence, as demonstrably evidenced by their formidable empirical performance across a spectrum of complex reasoning tasks.

MELA: Multilingual Evaluation of Linguistic Acceptability

1 code implementation15 Nov 2023 Ziyin Zhang, Yikang Liu, Weifang Huang, Junyu Mao, Rui Wang, Hai Hu

In this work, we present the largest benchmark to date on linguistic acceptability: Multilingual Evaluation of Linguistic Acceptability -- MELA, with 46K samples covering 10 languages from a diverse set of language families.

Code Generation Cross-Lingual Transfer +3

Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code

1 code implementation14 Nov 2023 Ziyin Zhang, Chaoyu Chen, Bingchang Liu, Cong Liao, Zi Gong, Hang Yu, Jianguo Li, Rui Wang

In this work we systematically review the recent advancements in software engineering with language models, covering 70+ models, 40+ evaluation tasks, 180+ datasets, and 900 related works.

Language Model Evaluation Language Modeling +1

Cannot find the paper you are looking for? You can Submit a new open access paper.