Search Results for author: Wei Chen

Found 555 papers, 166 papers with code

A Structure-Aware Argument Encoder for Literature Discourse Analysis

1 code implementation COLING 2022 Yinzi Li, Wei Chen, Zhongyu Wei, Yujun Huang, Chujun Wang, Siyuan Wang, Qi Zhang, Xuanjing Huang, Libo Wu

Existing research for argument representation learning mainly treats tokens in the sentence equally and ignores the implied structure information of argumentative context.

Position Representation Learning +1

Dual Refinement Underwater Object Detection Network

no code implementations ECCV 2020 Baojie Fan, Wei Chen, Yang Cong, Jiandong Tian

Due to the complex underwater environment, underwater imaging often encounters some problems such as blur, scale variation, color shift, and texture distortion.

Object object-detection +1

Combinatorial Pure Exploration for Dueling Bandit

no code implementations ICML 2020 Wei Chen, Yihan Du, Longbo Huang, Haoyu Zhao

For Borda winner, we establish a reduction of the problem to the original CPE-MAB setting and design PAC and exact algorithms that achieve both the sample complexity similar to that in the CPE-MAB setting (which is nearly optimal for a subclass of problems) and polynomial running time per round.

Position

Uplink Assisted Joint Channel Estimation and CSI Feedback: An Approach Based on Deep Joint Source-Channel Coding

no code implementations 15 Apr 2025 Yiran Guo, Wei Chen, Bo Ai

In frequency division duplex (FDD) multiple-input multiple-output (MIMO) wireless communication systems, the acquisition of downlink channel state information (CSI) is essential for maximizing spatial resource utilization and improving system spectral efficiency.

AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-Speech Synthesis

no code implementations 14 Apr 2025 Dan Luo, Chengyuan Ma, Weiqin Li, Jun Wang, Wei Chen, Zhiyong Wu

This scheme uses embeddings, extracted by Llama, PER-LLM-Embedder, and Moka, to match with samples in the knowledge database, selecting the most appropriate speech style for synthesis.

RAG Speech Synthesis +2

MiMu: Mitigating Multiple Shortcut Learning Behavior of Transformers

no code implementations 14 Apr 2025 Lili Zhao, Qi Liu, Wei Chen, Liyi Chen, Ruijun Sun, Min Hou, Yang Wang, Shijin Wang

Then, we further design a self-improvement strategy in the target model to reduce the reliance on multiple shortcuts.

Hyperbolic Diffusion Recommender Model

no code implementations 2 Apr 2025 Meng Yuan, Yutian Xiao, Wei Chen, Chu Zhao, Deqing Wang, Fuzhen Zhuang

To gain deeper insights into the limitations of diffusion models in recommender systems, we investigate the fundamental structural disparities between images and items.

model Recommendation Systems

Correlation-Attention Masked Temporal Transformer for User Identity Linkage Using Heterogeneous Mobility Data

no code implementations 28 Mar 2025 Ziang Yan, Xingyu Zhao, Hanqing Ma, Wei Chen, Jianpeng Qi, Yanwei Yu, Junyu Dong

With the rise of social media and Location-Based Social Networks (LBSN), check-in data across platforms has become crucial for User Identity Linkage (UIL).

Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models

no code implementations 24 Mar 2025 Zichen Miao, Wei Chen, Qiang Qiu

In this paper, we propose to tune the large pre-trained transformers by learning a small set of combination coefficients that construct a more expressive filter subspace from the original multi-head attention maps.

parameter-efficient fine-tuning Tensor Decomposition

Hierarchy-Aware and Channel-Adaptive Semantic Communication for Bandwidth-Limited Data Fusion

no code implementations 22 Mar 2025 Lei Guo, Wei Chen, Yuxuan Sun, Bo Ai, Nikolaos Pappas, Tony Quek

Obtaining high-resolution hyperspectral images (HR-HSI) is costly and data-intensive, making it necessary to fuse low-resolution hyperspectral images (LR-HSI) with high-resolution RGB images (HR-RGB) for practical applications.

Semantic Communication Super-Resolution

Growing a Twig to Accelerate Large Vision-Language Models

no code implementations 18 Mar 2025 Zhenwei Shao, Mingyang Wang, Zhou Yu, Wenwen Pan, Yan Yang, Tao Wei, Hongyuan Zhang, Ning Mao, Wei Chen, Jun Yu

Despite the success of these token pruning methods, they still suffer from two major shortcomings: (i) considerable accuracy drop due to insensitive attention signals in early layers, and (ii) limited speedup when generating long responses (e.g., 30 tokens).

Bridging Social Psychology and LLM Reasoning: Conflict-Aware Meta-Review Generation via Cognitive Alignment

no code implementations 18 Mar 2025 Wei Chen, Han Ding, Meng Yuan, Zhao Zhang, Deqing Wang, Fuzhen Zhuang

The rapid growth of scholarly submissions has overwhelmed traditional peer review systems, driving the need for intelligent automation to preserve scientific rigor.

Review Generation

Semantic Latent Motion for Portrait Video Generation

no code implementations 13 Mar 2025 Qiyuan Zhang, Chenyu Wu, Wenzhang Sun, Huaize Liu, Donglin Di, Wei Chen, Changqing Zou

Second, in the Reasoning step, long-term modeling and efficient reasoning are performed in this latent space to generate motion sequences.

Descriptive Video Generation

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

1 code implementation 13 Mar 2025 Yi Yang, Xiaoxuan He, Hongkun Pan, Xiyan Jiang, Yan Deng, Xingtao Yang, Haoyu Lu, Dacheng Yin, Fengyun Rao, Minfeng Zhu, Bo Zhang, Wei Chen

Existing visual-language models often struggle to effectively analyze and reason visual content, resulting in suboptimal performance on complex reasoning tasks.

Multimodal Reasoning

Extra Clients at No Extra Cost: Overcome Data Heterogeneity in Federated Learning with Filter Decomposition

no code implementations 11 Mar 2025 Wei Chen, Qiang Qiu

Data heterogeneity is one of the major challenges in federated learning (FL), which results in substantial client variance and slow convergence.

Federated Learning

FedEM: A Privacy-Preserving Framework for Concurrent Utility Preservation in Federated Learning

no code implementations 8 Mar 2025 Mingcong Xu, Xiaojin Zhang, Wei Chen, Hai Jin

Federated Learning (FL) enables collaborative training of models across distributed clients without sharing local data, addressing privacy concerns in decentralized systems.

Federated Learning Privacy Preserving

Deep Joint CSI Estimation-Feedback-Precoding for MU-MIMO OFDM Systems

no code implementations 6 Mar 2025 Yiran Guo, Wei Chen, Bo Ai, Lun Li

Furthermore, compared to conventional separate architecture, the proposed network architecture with joint architecture reduces the computational burden and model storage overhead at the UE side, facilitating the deployment of low-overhead multi-module joint architectures in practice.

GraphGarment: Learning Garment Dynamics for Bimanual Cloth Manipulation Tasks

no code implementations 4 Mar 2025 Wei Chen, Kelin Li, Dongmyoung Lee, Xiaoshuai Chen, Rui Zong, Petar Kormushev

In simulation experiments, GraphGarment achieves better garment state prediction performance, with a prediction error 0.46 cm lower than the best baseline.

Graph Neural Network

LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation

no code implementations 25 Feb 2025 Pengzhi Li, Pengfei Yu, Zide Liu, wei he, Xuhao Pan, Xudong Rao, Tao Wei, Wei Chen

In this paper, we introduce LDGen, a novel method for integrating large language models (LLMs) into existing text-to-image diffusion models while minimizing computational demands.

Image Generation Language Modeling +2

CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification

1 code implementation 25 Feb 2025 Mingkun Zhang, Keping Bi, Wei Chen, Jiafeng Guo, Xueqi Cheng

We propose two variants for our CLIPure approach: CLIPure-Diff, which models the likelihood of images' latent vectors with the DiffusionPrior module in DaLLE-2 (modeling the generation process of CLIP's latent vectors), and CLIPure-Cos, which models the likelihood with the cosine similarity between the embeddings of an image and "a photo of a.".

Denoising Zero-Shot Learning

A Deep Learning Framework with Geographic Information Adaptive Loss for Remote Sensing Images based UAV Self-Positioning

no code implementations 22 Feb 2025 Mingkun Li, ZiMing Wang, Guang Huo, Wei Chen, Xiaoning Zhao

Some methods obtain geolocation information in GPS-denied environments by matching ground objects in the UAV viewpoint with remote sensing images.

geo-localization

Continuous K-Max Bandits

no code implementations 19 Feb 2025 Yu Chen, Siwei Wang, Longbo Huang, Wei Chen

The continuous $K$-Max bandits introduce unique challenges, including discretization error from continuous-to-discrete conversion, non-deterministic tie-breaking under limited feedback, and biased estimation due to partial observability.

Distributed Computing Multi-Armed Bandits +2

FedEAT: A Robustness Optimization Framework for Federated LLMs

no code implementations 17 Feb 2025 Yahao Pang, Xingyuan Wu, Xiaojin Zhang, Wei Chen, Hai Jin

Significant advancements have been made by Large Language Models (LLMs) in the domains of natural language understanding and automated content creation.

Federated Learning Natural Language Understanding

NOMANet: A Graph Neural Network Enabled Power Allocation Scheme for NOMA

no code implementations 8 Feb 2025 Yipu Hou, Yang Lu, Wei Chen, Bo Ai, Dusit Niyato, Zhiguo Ding

This paper proposes a graph neural network (GNN) enabled power allocation scheme for non-orthogonal multiple access (NOMA) networks.

Graph Neural Network

Graph Neural Network Enabled Fluid Antenna Systems: A Two-Stage Approach

no code implementations 6 Feb 2025 Changpeng He, Yang Lu, Wei Chen, Bo Ai, Kai-Kit Wong, Dusit Niyato

An emerging fluid antenna system (FAS) brings a new dimension, i.e., the antenna positions, to deal with the deep fading, but simultaneously introduces challenges related to the transmit design.

Graph Neural Network

SWIPTNet: A Unified Deep Learning Framework for SWIPT based on GNN and Transfer Learning

no code implementations 6 Feb 2025 Hong Han, Yang Lu, Zihan Song, Ruichen Zhang, Wei Chen, Bo Ai, Dusit Niyato, Dong In Kim

The quality-of-service (QoS) constrained sum-rate maximization problems are, respectively, formulated for power-splitting (PS) receivers and time-switching (TS) receivers and solved by a unified graph neural network (GNN) based model termed SWIPT net (SWIPTNet).

Graph Neural Network Transfer Learning

Offline Learning for Combinatorial Multi-armed Bandits

no code implementations 31 Jan 2025 Xutong Liu, Xiangxiang Dai, Jinhang Zuo, Siwei Wang, Carlee-Joe Wong, John C. S. Lui, Wei Chen

The combinatorial multi-armed bandit (CMAB) is a fundamental sequential decision-making framework, extensively studied over the past decade.

Decision Making Language Modeling +5

InsQABench: Benchmarking Chinese Insurance Domain Question Answering with Large Language Models

1 code implementation 19 Jan 2025 Jing Ding, Kai Feng, Binbin Lin, Jiarui Cai, Qiushi Wang, Yu Xie, Xiaojin Zhang, Zhongyu Wei, Wei Chen

The application of large language models (LLMs) has achieved remarkable success in various fields, but their effectiveness in specialized domains like the Chinese insurance industry remains underexplored.

Benchmarking Question Answering +1

Uncertainty-Aware Digital Twins: Robust Model Predictive Control using Time-Series Deep Quantile Learning

no code implementations 17 Jan 2025 Yi-Ping Chen, Ying-Kuan Tsai, Vispi Karkaria, Wei Chen

This proactive yet uncertainty-aware control capability positions the proposed method as a potent tool for future Digital Twin applications and real-time process control in engineering systems.

Decision Making Model Predictive Control +3

What Limits LLM-based Human Simulation: LLMs or Our Design?

no code implementations 15 Jan 2025 Qian Wang, Jiaying Wu, Zhenheng Tang, Bingqiao Luo, Nuo Chen, Wei Chen, Bingsheng He

We argue that advancing LLM-based human simulation requires addressing both LLM's inherent limitations and simulation framework design challenges.

ACORD: An Expert-Annotated Retrieval Dataset for Legal Contract Drafting

no code implementations 11 Jan 2025 Steven H. Wang, Maksim Zubkov, Kexin Fan, Sarah Harrell, Yuyang Sun, Wei Chen, Andreas Plesner, Roger Wattenhofer

Information retrieval, specifically contract clause retrieval, is foundational to contract drafting because lawyers rarely draft contracts from scratch; instead, they locate and revise the most relevant precedent.

Information Retrieval Retrieval

Real-Time Decision-Making for Digital Twin in Additive Manufacturing with Model Predictive Control using Time-Series Deep Neural Networks

no code implementations 10 Jan 2025 Yi-Ping Chen, Vispi Karkaria, Ying-Kuan Tsai, Faith Rolark, Daniel Quispe, Robert X. Gao, Jian Cao, Wei Chen

Using Directed Energy Deposition (DED) additive manufacturing as a case study, we demonstrate the effectiveness of the proposed MPC in achieving melt pool temperature tracking to ensure part quality, while reducing porosity defects by regulating laser power to maintain melt pool depth constraints.

Decision Making Model Predictive Control +3

COLOR: A compositional linear operation-based representation of protein sequences for identification of monomer contributions to properties

no code implementations 10 Jan 2025 Akash Pandey, Wei Chen, Sinan Keten

The properties of biological materials like proteins and nucleic acids are largely determined by their primary sequence.

Noise-Tolerant Hybrid Prototypical Learning with Noisy Web Data

no code implementations 5 Jan 2025 Chao Liang, Linchao Zhu, Zongxin Yang, Wei Chen, Yi Yang

On the other hand, the relation modeling between noisy and clean images is not learned for the class prototype generation in an end-to-end manner, which results in a suboptimal class prototype.

Make Domain Shift a Catastrophic Forgetting Alleviator in Class-Incremental Learning

no code implementations 31 Dec 2024 Wei Chen, Yi Zhou

This paper discovers a counter-intuitive observation: by incorporating domain shift into CIL tasks, the forgetting rate is significantly reduced.

class-incremental learning Class Incremental Learning +2

Do Current Video LLMs Have Strong OCR Abilities? A Preliminary Study

1 code implementation 29 Dec 2024 Yulin Fei, Yuhui Gao, Xingyuan Xian, Xiaojin Zhang, Tao Wu, Wei Chen

With the rise of multimodal large language models, accurately extracting and understanding textual information from video content, referred to as video based optical character recognition (Video OCR), has become a crucial capability.

Motion Detection Optical Character Recognition +2

Emerging Microelectronic Materials by Design: Navigating Combinatorial Design Space with Scarce and Dispersed Data

no code implementations 23 Dec 2024 Hengrui Zhang, Alexandru B. Georgescu, Suraj Yerramilli, Christopher Karpovich, Daniel W. Apley, Elsa A. Olivetti, James M. Rondinelli, Wei Chen

In this Account, we review a team effort toward establishing a framework that integrates data-driven and physics-based methods to address these challenges and accelerate materials design.

STKDRec: Spatial-Temporal Knowledge Distillation for Takeaway Recommendation

no code implementations 21 Dec 2024 Shuyuan Zhao, Wei Chen, Boyan Shi, Liyong Zhou, Shuohao Lin, Huaiyu Wan

During the second STKD stage, a spatial-temporal Transformer is employed to comprehensively model dynamic user preferences on various types of fine-grained geospatial information from a sequence perspective.

Knowledge Distillation Knowledge Graphs

CognTKE: A Cognitive Temporal Knowledge Extrapolation Framework

1 code implementation 21 Dec 2024 Wei Chen, Yuting Wu, Shuhan Wu, ZhiYu Zhang, Mengqi Liao, Youfang Lin, Huaiyu Wan

Reasoning future unknowable facts on temporal knowledge graphs (TKGs) is a challenging task, holding significant academic and practical values for various fields.

Knowledge Graphs Relation

Security and Privacy of Digital Twins for Advanced Manufacturing: A Survey

no code implementations 18 Dec 2024 Alexander D. Zemskov, Yao Fu, Runchao Li, Xufei Wang, Vispi Karkaria, Ying-Kuan Tsai, Wei Chen, Jianjing Zhang, Robert Gao, Jian Cao, Kenneth A. Loparo, Pan Li

In Industry 4.0, the digital twin is one of the emerging technologies, offering simulation abilities to predict, refine, and interpret conditions and operations, where it is crucial to emphasize a heightened concentration on the associated security and privacy risks.

LLMs Can Simulate Standardized Patients via Agent Coevolution

1 code implementation 16 Dec 2024 Zhuoyun Du, Lujie Zheng, Renjun Hu, Yuyang Xu, Xiawei Li, Ying Sun, Wei Chen, Jian Wu, Haolei Cai, Haohao Ying

Training medical personnel using standardized patients (SPs) remains a complex challenge, requiring extensive domain expertise and role-specific practice.

Diagnostic Language Modeling +2

Smoothness Really Matters: A Simple Yet Effective Approach for Unsupervised Graph Domain Adaptation

1 code implementation 16 Dec 2024 Wei Chen, Guo Ye, Yakun Wang, Zhao Zhang, Libang Zhang, Daixin Wang, Zhiqiang Zhang, Fuzhen Zhuang

Given the sensitivity of GNNs to local structural features, even slight discrepancies between source and target graphs could lead to significant shifts in node embeddings, thereby reducing the effectiveness of knowledge transfer.

Domain Adaptation GRAPH DOMAIN ADAPTATION +1

Asynchronous Random Access in Massive MIMO Systems Facilitated by the Delay-Angle Domain

no code implementations 6 Dec 2024 Ao Chen, Wei Chen, Bo Ai, Petar Popovski

This paper explores contention-based random access (CBRA) schemes for asynchronous access in massive multiple-input multiple-output (MIMO) systems.

Action Detection Activity Detection

DataLab: A Unified Platform for LLM-Powered Business Intelligence

no code implementations 3 Dec 2024 Luoxuan Weng, Yinghao Tang, Yingchaojie Feng, Zhuo Chang, Ruiqin Chen, Haozhe Feng, Chen Hou, Danqing Huang, Yang Li, Huaming Rao, Haonan Wang, Canshi Wei, Xiaofeng Yang, Yuhui Zhang, Yifeng Zheng, Xiuqi Huang, Minfeng Zhu, Yuxin Ma, Bin Cui, Peng Chen, Wei Chen

To achieve this unification, we design a domain knowledge incorporation module tailored for enterprise-specific BI tasks, an inter-agent communication mechanism to facilitate information sharing across the BI workflow, and a cell-based context management strategy to enhance context utilization efficiency in BI notebooks.

Large Language Model Task Planning

Combinatorial Rising Bandit

no code implementations 1 Dec 2024 Seockbean Song, Youngsik Yoon, Siwei Wang, Wei Chen, Jungseul Ok

However, we provide the sub-linear regret lower bound for combinatorial rising bandit and show that CRUCB is provably efficient by showing that the regret upper bound is close to the regret lower bound.

Deep Reinforcement Learning Recommendation Systems

Learning Rate-Compatible Linear Block Codes: An Auto-Encoder Based Approach

no code implementations 27 Nov 2024 Yukun Cheng, Wei Chen, Tianwei Hou, Geoffrey Ye Li, Bo Ai

The coding process associated with AI or non-AI decoders and multiple puncturing patterns is optimized in a data-driven manner.

Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting

no code implementations 14 Nov 2024 Weilin Ruan, Wenzhuo Wang, Siru Zhong, Wei Chen, Li Liu, Yuxuan Liang

In this paper, we introduce the Spatio-Temporal Unitized Model (STUM), a unified framework designed to capture both spatial and temporal dependencies while addressing spatio-temporal heterogeneity through techniques such as distribution alignment and feature fusion.

Computational Efficiency Hyperparameter Optimization

FedDTPT: Federated Discrete and Transferable Prompt Tuning for Black-Box Large Language Models

no code implementations 1 Nov 2024 Jiaqi Wu, Simin Chen, Yuzhe Yang, Yijiang Li, Shiyue Hou, Rui Jing, Zehua Wang, Wei Chen, Zijian Tian

To address these challenges, we propose for the first time a federated discrete and transferable prompt tuning, namely FedDTPT, for black-box large language models.

Federated Learning Semantic Similarity +1

Fast and scalable Wasserstein-1 neural optimal transport solver for single-cell perturbation prediction

1 code implementation 1 Nov 2024 Yanshuo Chen, Zhengmian Hu, Wei Chen, Heng Huang

Our experiments demonstrate that the proposed $W_1$ neural optimal transport solver can mimic the $W_2$ OT solvers in finding a unique and "monotonic" map on 2D datasets.

Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models

1 code implementation 31 Oct 2024 Jianqun Zhou, Yuanlei Zheng, Wei Chen, Qianqian Zheng, Hui Su, Wei zhang, Rui Meng, Xiaoyu Shen

Instruction-following capabilities in LLMs have progressed significantly, enabling more complex user interactions through detailed prompts.

Instruction Following Retrieval

RSL-SQL: Robust Schema Linking in Text-to-SQL Generation

1 code implementation 31 Oct 2024 Zhenbiao Cao, Yuanlei Zheng, Zhihao Fan, Xiaojin Zhang, Wei Chen, Xiang Bai

Text-to-SQL generation aims to translate natural language questions into SQL statements.

Text-To-SQL

CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense

1 code implementation 30 Oct 2024 Mingkun Zhang, Keping Bi, Wei Chen, Quanrun Chen, Jiafeng Guo, Xueqi Cheng

Despite ongoing efforts to defend neural classifiers from adversarial attacks, they remain vulnerable, especially to unseen attacks.

Adversarial Defense Disentanglement +1

KANsformer for Scalable Beamforming

no code implementations 28 Oct 2024 Xinke Xie, Yang Lu, Chong-Yung Chi, Wei Chen, Bo Ai, Dusit Niyato

This paper proposes an unsupervised deep-learning (DL) approach by integrating transformer and Kolmogorov-Arnold networks (KAN) termed KANsformer to realize scalable beamforming for mobile communication systems.

Kolmogorov-Arnold Networks Transfer Learning

FairDgcl: Fairness-aware Recommendation with Dynamic Graph Contrastive Learning

1 code implementation 23 Oct 2024 Wei Chen, Meng Yuan, Zhao Zhang, Ruobing Xie, Fuzhen Zhuang, Deqing Wang, Rui Liu

Specifically, we propose FairDgcl, a dynamic graph adversarial contrastive learning framework that aims to improve fairness in recommender systems.

Contrastive Learning Data Augmentation +2

Expand and Compress: Exploring Tuning Principles for Continual Spatio-Temporal Graph Forecasting

no code implementations 16 Oct 2024 Wei Chen, Yuxuan Liang

The widespread deployment of sensing devices leads to a surge in data for spatio-temporal forecasting applications such as traffic flow, air quality, and wind energy.

Graph Neural Network Spatio-Temporal Forecasting

TV-3DG: Mastering Text-to-3D Customized Generation with Visual Prompt

no code implementations 16 Oct 2024 Jiahui Yang, Donglin Di, Baorui Ma, Xun Yang, Yongjia Ma, Wenzhang Sun, Wei Chen, Jianxun Cui, Zhou Xue, Meng Wang, Yebin Liu

To address this, we propose a novel algorithm, Classifier Score Matching (CSM), which removes the difference term in SDS and uses a deterministic noise addition process to reduce noise during optimization, effectively overcoming the low-quality limitations of SDS in our customized generation framework.

3D Generation Text to 3D

Learning to Customize Text-to-Image Diffusion In Diverse Context

no code implementations 14 Oct 2024 Taewook Kim, Wei Chen, Qiang Qiu

This often results in the model becoming overfitted to these training images and unable to generalize to new contexts in future text prompts.

Self-Supervised Learning

ChartKG: A Knowledge-Graph-Based Representation for Chart Images

no code implementations 13 Oct 2024 Zhiguang Zhou, Haoxuan Wang, Zhengqing Zhao, Fengling Zheng, Yongheng Wang, Wei Chen, Yong Wang

We present four cases to illustrate how our knowledge-graph-based representation can model the detailed visual elements and semantic relations in charts, and further demonstrate how our approach can benefit downstream applications such as semantic-aware chart retrieval and chart question answering.

Chart Question Answering Knowledge Graph Completion +4

PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling

1 code implementation 8 Oct 2024 Xudong Xie, Hao Yan, Liang Yin, Yang Liu, Jing Ding, Minghui Liao, Yuliang Liu, Wei Chen, Xiang Bai

In this paper, we introduce PDF-WuKong, a multimodal large language model (MLLM) which is designed to enhance multimodal question-answering (QA) for long PDF documents.

document understanding Language Modeling +4

LoTLIP: Improving Language-Image Pre-training for Long Text Understanding

no code implementations 7 Oct 2024 Wei Wu, Kecheng Zheng, Shuailei Ma, Fan Lu, Yuxin Guo, Yifei Zhang, Wei Chen, Qingpei Guo, Yujun Shen, Zheng-Jun Zha

Then, after incorporating corner tokens to aggregate diverse textual information, we manage to help the model catch up to its original level of short text understanding yet greatly enhance its capability of long text understanding.

Image Classification Image Retrieval

MC-CoT: A Modular Collaborative CoT Framework for Zero-shot Medical-VQA with LLM and MLLM Integration

1 code implementation 6 Oct 2024 Lai Wei, Wenkai Wang, Xiaoyu Shen, Yu Xie, Zhihao Fan, Xiaojin Zhang, Zhongyu Wei, Wei Chen

In recent advancements, multimodal large language models (MLLMs) have been fine-tuned on specific medical image datasets to address medical visual question answering (Med-VQA) tasks.

Medical Visual Question Answering Question Answering +1

Structural-Entropy-Based Sample Selection for Efficient and Effective Learning

no code implementations 3 Oct 2024 Tianchi Xie, Jiangning Zhu, Guozu Ma, Minzhi Lin, Wei Chen, Weikai Yang, Shixia Liu

Based on the decomposition, we present Structural-Entropy-based sample Selection (SES), a method that integrates both global and local information to select informative and representative samples.

Active Learning Continual Learning

GNN-Enabled Optimization of Placement and Transmission Design for UAV Communications

no code implementations 3 Oct 2024 Qinyu Wang, Yang Lu, Wei Chen, Bo Ai, Zhangdui Zhong, Dusit Niyato

This paper applies graph neural networks (GNN) in UAV communications to optimize the placement and transmission design.

Model-Based GNN Enabled Energy-Efficient Beamforming for Ultra-Dense Wireless Networks

no code implementations 3 Oct 2024 Rongsheng Zhang, Yang Lu, Wei Chen, Bo Ai, Zhiguo Ding

This paper investigates deep learning enabled beamforming design for ultra-dense wireless networks by integrating prior knowledge and graph neural network (GNN), named model-based GNN.

Graph Neural Network

Generative Retrieval Meets Multi-Graded Relevance

no code implementations 27 Sep 2024 Yubao Tang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Xueqi Cheng

GR$^2$ focuses on two key components: ensuring relevant and distinct identifiers, and implementing multi-graded constrained contrastive training.

Decoder Information Retrieval +1

FaceVid-1K: A Large-Scale High-Quality Multiracial Human Face Video Dataset

no code implementations 23 Sep 2024 Donglin Di, He Feng, Wenzhang Sun, Yongjia Ma, Hao Li, Wei Chen, Xiaofei Gou, Tonghua Su, Xun Yang

We obtain the corresponding performance benchmarks and compare them with those trained on public datasets to demonstrate the superiority of our dataset.

Image Generation Unconditional Video Generation

From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning

no code implementations 3 Sep 2024 Wei Chen, Zhen Huang, Liang Xie, Binbin Lin, Houqiang Li, Le Lu, Xinmei Tian, Deng Cai, Yonggang Zhang, Wenxiao Wang, Xu Shen, Jieping Ye

Recent works propose to employ supervised fine-tuning (SFT) to mitigate the sycophancy issue, while it typically leads to the degeneration of LLMs' general capability.

Generic Objects as Pose Probes for Few-shot View Synthesis

no code implementations 29 Aug 2024 Zhirui Gao, Renjiao Yi, Chenyang Zhu, Ke Zhuang, Wei Chen, Kai Xu

COLMAP is frequently employed for preprocessing to estimate poses, while it necessitates a large number of feature matches to operate effectively, and it struggles with scenes characterized by sparse features, large baselines between images, or a limited number of input images.

NeRF Novel View Synthesis +1

Do Graph Neural Networks Work for High Entropy Alloys?

1 code implementation 29 Aug 2024 Hengrui Zhang, Ruishu Huang, Jie Chen, James M. Rondinelli, Wei Chen

Graph neural networks (GNNs) have excelled in predictive modeling for both crystals and molecules, owing to the expressiveness of graph representations.

Property Prediction

GRPose: Learning Graph Relations for Human Image Generation with Pose Priors

1 code implementation 29 Aug 2024 Xiangchen Yin, Donglin Di, Lei Fan, Hao Li, Wei Chen, Xiaofei Gou, Yang song, Xiao Sun, Xun Yang

In this paper, we propose a framework that delves into the graph relations of pose priors to provide control information for human image generation.

Image Generation Pose Estimation

Squid: Long Context as a New Modality for Energy-Efficient On-Device Language Models

no code implementations 28 Aug 2024 Wei Chen, Zhiyuan Li, Shuo Xin, Yihao Wang

Our work contributes to the development of more sustainable and scalable language models for on-device applications, addressing the critical need for energy-efficient and responsive AI technologies in resource-constrained environments while maintaining the accuracy to understand long contexts.

Decoder

On-Device Language Models: A Comprehensive Review

1 code implementation 26 Aug 2024 Jiajun Xu, Zhiyuan Li, Wei Chen, Qun Wang, Xin Gao, Qi Cai, Ziyuan Ling

For a comprehensive review of research work and educational resources on on-device large language models (LLMs), please visit https://github.com/NexaAI/Awesome-LLMs-on-device.

Knowledge Distillation Quantization

Histology Virtual Staining with Mask-Guided Adversarial Transfer Learning for Tertiary Lymphoid Structure Detection

no code implementations 26 Aug 2024 Qiuli Wang, Yongxu Liu, Li Ma, Xianqi Wang, Wei Chen, Xiaohong Yao

Capitalizing on the prevalence of H&E staining slides, we introduce a novel Mask-Guided Adversarial Transfer Learning method designed for virtual pathological staining.

Specificity Transfer Learning +2

FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing

1 code implementation 22 Aug 2024 Jue Wang, Yuxiang Lin, Tianshuo Yuan, Zhi-Qi Cheng, Xiaolong Wang, Jiao GH, Wei Chen, Xiaojiang Peng

Our approach employs a VLLM in comprehending the image content, mask, and user instructions.

PartGS: Learning Part-aware 3D Representations by Fusing 2D Gaussians and Superquadrics

no code implementations 20 Aug 2024 Zhirui Gao, Renjiao Yi, Yuhang Huang, Wei Chen, Chenyang Zhu, Kai Xu

In this paper, we introduce PartGS, part-aware 3D reconstruction by a hybrid representation of 2D Gaussians and Superquadrics, which parses objects or scenes into semantic parts, digging 3D structural clues from multi-view image inputs.

3D Reconstruction

Navigating Spatio-Temporal Heterogeneity: A Graph Transformer Approach for Traffic Forecasting

1 code implementation 20 Aug 2024 Jianxiang Zhou, Erdong Liu, Wei Chen, Siru Zhong, Yuxuan Liang

To tackle these challenges, we introduce the Spatio-Temporal Graph Transformer (STGormer), which effectively integrates attribute and structure information inherent in traffic data for learning spatio-temporal correlations, and a mixture-of-experts module for capturing heterogeneity along spatial and temporal axes.

Attribute

MePT: Multi-Representation Guided Prompt Tuning for Vision-Language Model

no code implementations 19 Aug 2024 Xinyang Wang, Yi Yang, Minfeng Zhu, Kecheng Zheng, Shi Liu, Wei Chen

Recent advancements in pre-trained Vision-Language Models (VLMs) have highlighted the significant potential of prompt tuning for adapting these models to a wide range of downstream tasks.

Domain Generalization Language Modeling +1

Towards Boosting LLMs-driven Relevance Modeling with Progressive Retrieved Behavior-augmented Prompting

no code implementations18 Aug 2024 Zeyuan Chen, Haiyan Wu, Kaixin Wu, Wei Chen, Mingjie Zhong, Jia Xu, Zhongyi Liu, Wei zhang

In response, we propose ProRBP, a novel Progressive Retrieved Behavior-augmented Prompting framework for integrating search scenario-oriented knowledge with LLMs effectively.

Impacts of Darwinian Evolution on Pre-trained Deep Neural Networks

no code implementations10 Aug 2024 Guodong Du, Runhua Jiang, Senqiao Yang, Haoyang Li, Wei Chen, Keren Li, Sim Kuan Goh, Ho-Kin Tang

The empirical results show that the proposed framework has positive impacts on the network, with reduced over-fitting and an order of magnitude lower time complexity compared to BP.

An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding

1 code implementation2 Aug 2024 Wei Chen, Long Chen, Yu Wu

In this paper, we propose an efficient and effective multi-task visual grounding (EEVG) framework based on Transformer Decoder to address this issue, which reduces the cost in both language and visual aspects.

Decoder Reasoning Segmentation +1

A Backbone for Long-Horizon Robot Task Understanding

no code implementations2 Aug 2024 Xiaoshuai Chen, Wei Chen, Dongmyoung Lee, Yukun Ge, Nicolas Rojas, Petar Kormushev

In the online testing stage, after a one-shot demonstration of a new task is collected, our MGSF network extracts high-level knowledge, which is then encoded into the image using Action Registration (ActionREG).

Language Modelling Large Language Model

Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs

1 code implementation31 Jul 2024 Shi Liu, Kecheng Zheng, Wei Chen

However, the scale disparity between vision encoder and language model may lead to LLMs assuming a predominant role in multi-modal comprehension.

Hallucination Image Comprehension +2

Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints

no code implementations26 Jul 2024 Lei Guo, Wei Chen, Yuxuan Sun, Bo Ai, Nikolaos Pappas, Tony Quek

This paper introduces a diffusion-driven semantic communication framework with advanced VAE-based compression for bandwidth-constrained generative models.

Denoising Semantic Communication

Deep learning for predicting the occurrence of tipping points

1 code implementation26 Jul 2024 Chengzuo Zhuge, Jiawei Li, Wei Chen

The ability to predict the occurrence of tipping points from time series data remains an outstanding challenge and a major interest in a broad range of research fields.

Deep Learning Time Series

Theoretical Analysis of Privacy Leakage in Trustworthy Federated Learning: A Perspective from Linear Algebra and Optimization Theory

no code implementations23 Jul 2024 Xiaojin Zhang, Wei Chen

From the optimization theory perspective, we establish an upper bound on the privacy leakage in terms of the batch size, the distortion extent, and several other factors.

Federated Learning Privacy Preserving

Norface: Improving Facial Expression Analysis by Identity Normalization

1 code implementation22 Jul 2024 Hanwei Liu, Rudong An, Zhimeng Zhang, Bowen Ma, Wei zhang, Yan Song, Yujing Hu, Wei Chen, Yu Ding

First, the carefully designed normalization network strives to directly remove the above task-irrelevant noise, by maintaining facial expression consistency but normalizing all original images to a common identity with consistent pose and background.

Classification Facial Action Unit Detection +2

Real-Time 3D Occupancy Prediction via Geometric-Semantic Disentanglement

no code implementations18 Jul 2024 Yulin He, Wei Chen, Tianci Xun, Yusong Tan

In the BEV branch, a BEV-level temporal fusion module and a U-Net encoder are introduced to extract dense semantic features.

3D geometry Autonomous Driving +3

Heterogenous Multi-Source Data Fusion Through Input Mapping and Latent Variable Gaussian Process

no code implementations15 Jul 2024 Yigitcan Comlek, Sandipp Krishnan Ravi, Piyush Pandita, Sayan Ghosh, Liping Wang, Wei Chen

In the second stage, a multi-source data fusion model enabled by LVGP is leveraged to build a single source-aware surrogate model on the transformed reference space.

Cantilever Beam Transfer Learning

Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks

1 code implementation13 Jul 2024 Shengbin Yue, Siyuan Wang, Wei Chen, Xuanjing Huang, Zhongyu Wei

Recent advancements in Large Language Models (LLMs) have led to significant breakthroughs in various natural language processing tasks.

Hallucination Navigate

One-Shot Pose-Driving Face Animation Platform

no code implementations12 Jul 2024 He Feng, Donglin Di, Yongjia Ma, Wei Chen, Tonghua Su

The objective of face animation is to generate dynamic and expressive talking head videos from a single reference face, utilizing driving conditions derived from either video or audio inputs.

Tissue-Contrastive Semi-Masked Autoencoders for Segmentation Pretraining on Chest CT

no code implementations12 Jul 2024 Jie Zheng, Ru Wen, Haiqin Hu, Lina Wei, Kui Su, Wei Chen, Chen Liu, Jun Wang

Existing Masked Image Modeling (MIM) depends on a spatial patch-based masking-reconstruction strategy to perceive objects' features from unlabeled images, which may face two limitations when applied to chest CT: 1) inefficient feature learning due to complex anatomical details presented in CT images, and 2) suboptimal knowledge transfer owing to input disparity between upstream and downstream models.

Contrastive Learning Self-Supervised Learning +1

Entropy-Informed Weighting Channel Normalizing Flow

1 code implementation6 Jul 2024 Wei Chen, Shian Du, Shigui Li, Delu Zeng, John Paisley

Normalizing Flows (NFs) have gained popularity among deep generative models due to their ability to provide exact likelihood estimation and efficient sampling.

Density Estimation

A Unified Learn-to-Distort-Data Framework for Privacy-Utility Trade-off in Trustworthy Federated Learning

no code implementations5 Jul 2024 Xiaojin Zhang, Mingcong Xu, Wei Chen

In this paper, we first give an introduction to the theoretical basis of the privacy-utility equilibrium in federated learning based on Bayesian privacy definitions and total variation distance privacy definitions.

Federated Learning Navigate +1

AI-Driven Mobility Management for High-Speed Railway Communications: Compressed Measurements and Proactive Handover

no code implementations5 Jul 2024 Wen Li, Wei Chen, Shiyue Wang, Yuanyuan Zhang, Michail Matthaiou, Bo Ai

Compared with the traditional event A3-based handover mechanism, the proposed approach significantly reduces the RLF rates while saving 50% of the beam measurement overhead.

Beam Prediction Compressive Sensing +1

CURLS: Causal Rule Learning for Subgroups with Significant Treatment Effect

no code implementations1 Jul 2024 Jiehui Zhou, Linxiao Yang, Xingyu Liu, Xinyue Gu, Liang Sun, Wei Chen

In this paper, we propose CURLS, a novel rule learning method leveraging HTE, which can effectively describe subgroups with significant treatment effects.

Causal Inference Management

Octo-planner: On-device Language Model for Planner-Action Agents

no code implementations26 Jun 2024 Wei Chen, Zhiyuan Li, Zhen Guo, Yikang Shen

In this paper, we present an efficient on-device Planner-Action framework that separates planning and action execution into two distinct components: a planner agent based on Phi-3 Mini, a 3.8 billion parameter LLM optimized for edge devices, and an action agent using the Octopus model for function execution.

Computational Efficiency In-Context Learning +2

Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective

1 code implementation20 Jun 2024 Yuchen Wen, Keping Bi, Wei Chen, Jiafeng Guo, Xueqi Cheng

As Large Language Models (LLMs) become an important way of information seeking, there have been increasing concerns about the unethical content LLMs may generate.

Distributed Stochastic Gradient Descent with Staleness: A Stochastic Delay Differential Equation Based Framework

no code implementations17 Jun 2024 Siyuan Yu, Wei Chen, H. Vincent Poor

It is interestingly shown that increasing the number of activated workers does not necessarily accelerate distributed SGD due to staleness.

Scheduling

CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation

1 code implementation15 Jun 2024 Wei Chen, Lin Li, Yongqi Yang, Bin Wen, Fan Yang, Tingting Gao, Yu Wu, Long Chen

To address this gap, we introduce CoMM, a high-quality Coherent interleaved image-text MultiModal dataset designed to enhance the coherence, consistency, and alignment of generated multimodal content.

In-Context Learning Visual Storytelling

MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents

no code implementations12 Jun 2024 Luyuan Wang, Yongyu Deng, Yiwei Zha, Guodong Mao, Qinmin Wang, Tianchen Min, Wei Chen, Shoufa Chen

Large language model (LLM)-based mobile agents are increasingly popular due to their capability to interact directly with mobile phone Graphic User Interfaces (GUIs) and their potential to autonomously manage daily tasks.

Benchmarking Language Modeling +2

CLDTA: Contrastive Learning based on Diagonal Transformer Autoencoder for Cross-Dataset EEG Emotion Recognition

no code implementations12 Jun 2024 Yuan Liao, Yuhong Zhang, Shenghuan Wang, Xiruo Zhang, Yiling Zhang, Wei Chen, Yuzhe Gu, Liya Huang

Recent advances in non-invasive EEG technology have broadened its application in emotion recognition, yielding a multitude of related datasets.

Contrastive Learning EEG +1

VulDetectBench: Evaluating the Deep Capability of Vulnerability Detection with Large Language Models

1 code implementation11 Jun 2024 Yu Liu, Lang Gao, Mingxin Yang, Yu Xie, Ping Chen, Xiaojin Zhang, Wei Chen

However, sound comprehensive research on detecting program vulnerabilities, a more specific task related to code, and evaluating the performance of LLMs in this more specialized scenario is still lacking.

Vulnerability Detection

Global Parameterization-based Texture Space Optimization

no code implementations6 Jun 2024 Wei Chen, Yuxue Ren, Na lei, Zhongxuan Luo, Xianfeng GU

Experiments show the effectiveness of the proposed method and its potential to improve storage and rendering efficiency.

A Combination Model for Time Series Prediction using LSTM via Extracting Dynamic Features Based on Spatial Smoothing and Sequential General Variational Mode Decomposition

no code implementations5 Jun 2024 Jianyu Liu, Wei Chen, Yong Zhang, Zhenfeng Chen, Bin Wan, Jinwei Hu

To address problems such as the difficulty of extracting effective features and the low accuracy of sales volume prediction, both caused by the complex relationships in market sales time series, we propose a time series prediction method for market sales volume based on a combination model of Sequential General VMD and a spatial smoothing long short-term memory neural network (SS-LSTM).

Prediction Time Series +1

A Combination Model Based on Sequential General Variational Mode Decomposition Method for Time Series Prediction

no code implementations5 Jun 2024 Wei Chen, Yuanyuan Yang, Jianyu Liu

Within the prediction interval, our proposed combination model shows advantages over the traditional decomposition-based prediction models used as control groups.

Prediction Time Series +1

Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond

no code implementations3 Jun 2024 Xutong Liu, Siwei Wang, Jinhang Zuo, Han Zhong, Xuchuang Wang, Zhiyong Wang, Shuai Li, Mohammad Hajiesmaili, John C. S. Lui, Wei Chen

We introduce a novel framework of combinatorial multi-armed bandits (CMAB) with multivariant and probabilistically triggering arms (CMAB-MT), where the outcome of each arm is a $d$-dimensional multivariant random variable and the feedback follows a general arm triggering process.

Multi-Armed Bandits Reinforcement Learning (RL)

No Free Lunch Theorem for Privacy-Preserving LLM Inference

no code implementations31 May 2024 Xiaojin Zhang, Yahao Pang, Yan Kang, Wei Chen, Lixin Fan, Hai Jin, Qiang Yang

Therefore, it is essential to evaluate the balance between the risk of privacy leakage and loss of utility when conducting effective protection mechanisms.

Privacy Preserving

Identifying Functional Brain Networks of Spatiotemporal Wide-Field Calcium Imaging Data via a Long Short-Term Memory Autoencoder

no code implementations30 May 2024 Xiaohui Zhang, Eric C Landsness, Lindsey M Brier, Wei Chen, Michelle J. Tang, Hanyang Miao, Jin-Moo Lee, Mark A. Anastasio, Joseph P. Culver

The goal of this study is to elucidate and illustrate, qualitatively and quantitatively, the FBNs identified by use of the LSTM-AER method and compare them to those from traditional SBC and ICA.

Representation Learning

Fully Exploiting Every Real Sample: SuperPixel Sample Gradient Model Stealing

1 code implementation CVPR 2024 Yunlong Zhao, Xiaoheng Deng, Yijing Liu, Xinjun Pei, Jiazhi Xia, Wei Chen

With the basic idea of imitating the victim model's low-variance patch-level gradients instead of pixel-level gradients, SPSG achieves efficient sample gradient estimation through two steps.

VideoQA-SC: Adaptive Semantic Communication for Video Question Answering

no code implementations17 May 2024 Jiangyuan Guo, Wei Chen, Yuxuan Sun, Jialong Xu, Bo Ai

The difficulty in such system design lies in the extraction of task-related compact semantic representations and their accurate delivery over noisy channels.

Question Answering Semantic Communication +2

Feature-based Low-Rank Compression of Large Language Models via Bayesian Optimization

no code implementations17 May 2024 Yixin Ji, Yang Xiang, Juntao Li, Wei Chen, Zhongyi Liu, Kehai Chen, Min Zhang

To address the challenges of low-rank compression in LLMs, we conduct empirical research on the low-rank characteristics of large models.

Bayesian Optimization Low-rank compression

ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models

no code implementations15 May 2024 Siwei Wang, Yifei Shen, Shi Feng, Haoran Sun, Shang-Hua Teng, Wei Chen

Furthermore, our theoretical analysis of gradient-based learning dynamics reveals that LLMs can learn both the adjacency and a limited form of the reachability matrices.

Deep Learning for CSI Feedback: One-Sided Model and Joint Multi-Module Learning Perspectives

no code implementations9 May 2024 Yiran Guo, Wei Chen, Feifei Sun, Jiaming Cheng, Michail Matthaiou, Bo Ai

The use of deep learning (DL) for channel state information (CSI) feedback has garnered widespread attention across academia and industry.

ProbRadarM3F: mmWave Radar based Human Skeletal Pose Estimation with Probability Map Guided Multi-Format Feature Fusion

no code implementations8 May 2024 Bing Zhu, Zixin He, Weiyi Xiong, Guanhua Ding, Jianan Liu, Tao Huang, Wei Chen, Wei Xiang

However, mmWave radar relies on the collection of reflected signals from the target, and the information contained in the radar signals is difficult to fully exploit.

Pose Estimation

Octopus v4: Graph of language models

no code implementations30 Apr 2024 Wei Chen, Zhiyuan Li

Additionally, we explore the use of graph as a versatile data structure that effectively coordinates multiple open-source models by harnessing the capabilities of the Octopus model and functional tokens.

MMLU

ControlTraj: Controllable Trajectory Generation with Topology-Constrained Diffusion Model

no code implementations23 Apr 2024 Yuanshao Zhu, James Jianqiao Yu, Xiangyu Zhao, Qidong Liu, Yongchao Ye, Wei Chen, Zijian Zhang, Xuetao Wei, Yuxuan Liang

Generating trajectory data is among the promising solutions for addressing privacy concerns, collection costs, and proprietary restrictions usually associated with human mobility analyses.

Denoising Diversity

decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points

1 code implementation19 Apr 2024 Yi Guo, Fanliu Kong, Xiaoyang Li, Hui Li, Wei Chen, Xiaogang Tian, Jinping Cai, Yang Zhang, Shouda Liu

However, existing quantization schemes suffer from significant accuracy degradation at very low bits, or require some additional computational overhead when deployed, making them difficult to apply to large-scale applications in industry.

Quantization

Adaptive Catalyst Discovery Using Multicriteria Bayesian Optimization with Representation Learning

1 code implementation18 Apr 2024 Jie Chen, Pengfei Ou, Yuxin Chang, Hengrui Zhang, Xiao-Yan Li, Edward H. Sargent, Wei Chen

The results demonstrate that our approach achieves high prediction accuracy, facilitates interpretable feature extraction, and enables multicriteria design optimization, leading to significant reduction of computing power and time (10x reduction of required DFT calculations) in high-performance catalyst discovery.

Bayesian Optimization Representation Learning +1

Graph Neural Networks for Wireless Networks: Graph Representation, Architecture and Evaluation

no code implementations18 Apr 2024 Yang Lu, Yuhang Li, Ruichen Zhang, Wei Chen, Bo Ai, Dusit Niyato

Graph neural networks (GNNs) have been regarded as the basic model to facilitate deep learning (DL) to revolutionize resource allocation in wireless networks.

Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent

no code implementations17 Apr 2024 Wei Chen, Zhiyuan Li

A multimodal AI agent is characterized by its ability to process and learn from various types of data, including natural language, visual, and audio inputs, to inform its actions.

AI Agent

Interpolating neural network: A novel unification of machine learning and interpolation theory

1 code implementation16 Apr 2024 Chanwook Park, Sourav Saha, Jiachen Guo, Hantao Zhang, Xiaoyu Xie, Miguel A. Bessa, Dong Qian, Wei Chen, Gregory J. Wagner, Jian Cao, Wing Kam Liu

Artificial intelligence (AI) has revolutionized software development, shifting from task-specific codes (Software 1.0) to neural network-based approaches (Software 2.0).

Physical Simulations Tensor Decomposition

Building Semantic Communication System via Molecules: An End-to-End Training Approach

no code implementations15 Apr 2024 Yukun Cheng, Wei Chen, Bo Ai

Furthermore, we propose a channel network to enable the E2E learning over the non-differentiable molecular channel.

Semantic Communication

Bridging Data Islands: Geographic Heterogeneity-Aware Federated Learning for Collaborative Remote Sensing Semantic Segmentation

no code implementations14 Apr 2024 Jieyi Tan, Yansheng Li, Sergey A. Bartalev, Shinkarenko Stanislav, Bo Dang, Yongjun Zhang, Liangqi Yuan, Wei Chen

Our framework consists of three modules, including the Global Insight Enhancement (GIE) module, the Essential Feature Mining (EFM) module and the Local-Global Balance (LoGo) module.

Earth Observation Federated Learning +2

JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models

no code implementations12 Apr 2024 Yingchaojie Feng, Zhizhang Chen, Zhining Kang, Sijia Wang, Minfeng Zhu, Wei zhang, Wei Chen

Addressing these concerns necessitates a comprehensive analysis of jailbreak prompts to evaluate LLMs' defensive capabilities and identify potential weaknesses.

Low-rank Adaptation for Spatio-Temporal Forecasting

1 code implementation11 Apr 2024 Weilin Ruan, Wei Chen, Xilin Dang, Jianxiang Zhou, Weichuang Li, Xu Liu, Yuxuan Liang

Spatio-temporal forecasting is crucial in real-world dynamic systems, predicting future changes using historical data from diverse locations.

Prediction Spatio-Temporal Forecasting

Octopus: On-device language model for function calling of software APIs

no code implementations2 Apr 2024 Wei Chen, Zhiyuan Li, Mingyuan Ma

In the rapidly evolving domain of artificial intelligence, Large Language Models (LLMs) play a crucial role due to their advanced text processing and generation abilities.

Language Modeling Language Modelling

Octopus v2: On-device language model for super agent

no code implementations2 Apr 2024 Wei Chen, Zhiyuan Li

Current on-device models for function calling face issues with latency and accuracy.

Language Modeling Language Modelling +1

A Correction of Pseudo Log-Likelihood Method

no code implementations26 Mar 2024 Shi Feng, Nuoya Xiong, Zhijie Zhang, Wei Chen

Pseudo log-likelihood is a type of maximum likelihood estimation (MLE) method used in various fields including contextual bandits, influence maximization of social networks, and causal bandits.

Multi-Armed Bandits

Invisible Gas Detection: An RGB-Thermal Cross Attention Network and A New Benchmark

1 code implementation26 Mar 2024 Jue Wang, Yuxiang Lin, Qi Zhao, Dong Luo, Shuaibao Chen, Wei Chen, Xiaojiang Peng

The widespread use of various chemical gases in industrial processes necessitates effective measures to prevent their leakage during transportation and storage, given their high toxicity.

UrbanVLP: Multi-Granularity Vision-Language Pretraining for Urban Socioeconomic Indicator Prediction

2 code implementations25 Mar 2024 Xixuan Hao, Wei Chen, Yibo Yan, Siru Zhong, Kun Wang, Qingsong Wen, Yuxuan Liang

Our UrbanVLP seamlessly integrates multi-granularity information from both macro (satellite) and micro (street-view) levels, overcoming the limitations of prior pretrained models.

Hallucination Text Generation

DreamLIP: Language-Image Pre-training with Long Captions

1 code implementation25 Mar 2024 Kecheng Zheng, Yifei Zhang, Wei Wu, Fan Lu, Shuailei Ma, Xin Jin, Wei Chen, Yujun Shen

Motivated by this, we propose to dynamically sample sub-captions from the text label to construct multiple positive pairs, and introduce a grouping loss to match the embeddings of each sub-caption with its corresponding local image patches in a self-supervised manner.

Contrastive Learning Image-text Retrieval +5

Deciphering the Interplay between Local Differential Privacy, Average Bayesian Privacy, and Maximum Bayesian Privacy

no code implementations25 Mar 2024 Xiaojin Zhang, Yulin Fei, Wei Chen

The swift evolution of machine learning has led to the emergence of various definitions of privacy due to the threats it poses to privacy, including the concept of local differential privacy (LDP).

Privacy Preserving

Heterogeneous Federated Learning with Splited Language Model

no code implementations24 Mar 2024 Yifan Shi, Yuhui Zhang, Ziyue Huang, Xiaofeng Yang, Li Shen, Wei Chen, Xueqian Wang

Federated Split Learning (FSL) is a promising distributed learning paradigm in practice, which gathers the strengths of both Federated Learning (FL) and Split Learning (SL) paradigms, to ensure model privacy while diminishing the resource overhead of each client, especially on large transformer models in a resource-constrained environment, e.g., Internet of Things (IoT).

Federated Learning Language Modeling +2

On the Convergence of Adam under Non-uniform Smoothness: Separability from SGDM and Beyond

no code implementations22 Mar 2024 Bohan Wang, Huishuai Zhang, Qi Meng, Ruoyu Sun, Zhi-Ming Ma, Wei Chen

This paper aims to clearly distinguish between Stochastic Gradient Descent with Momentum (SGDM) and Adam in terms of their convergence rates.

Listwise Generative Retrieval Models via a Sequential Learning Process

1 code implementation19 Mar 2024 Yubao Tang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Xueqi Cheng

Specifically, we view the generation of a ranked docid list as a sequence learning process: at each step we learn a subset of parameters that maximizes the corresponding generation likelihood of the $i$-th docid given the (preceding) top $i-1$ docids.

Retrieval

Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph

1 code implementation14 Mar 2024 Donglin Di, Jiahui Yang, Chaofan Luo, Zhou Xue, Wei Chen, Xun Yang, Yue Gao

Our framework is anchored by a well-established mainflow and an essential module, named "Geometry and Texture Hypergraph Refiner (HGRefiner)".

3D Generation 3DGS +1

CarbonNet: How Computer Vision Plays a Role in Climate Change? Application: Learning Geomechanics from Subsurface Geometry of CCS to Mitigate Global Warming

no code implementations9 Mar 2024 Wei Chen, Yunan Li, Yuan Tian

We introduce a new approach using computer vision to predict the land surface displacement from subsurface geometry images for Carbon Capture and Sequestration (CCS).

Decision Making Video Prediction

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

1 code implementation8 Mar 2024 Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, Paul Voigtlaender, Rohan Jain, Gabriela Surita, Kareem Mohamed, Rory Blevins, Junwhan Ahn, Tao Zhu, Kornraphop Kawintiranon, Orhan Firat, Yiming Gu, Yujing Zhang, Matthew Rahtz, Manaal Faruqui, Natalie Clay, Justin Gilmer, JD Co-Reyes, Ivo Penchev, Rui Zhu, Nobuyuki Morioka, Kevin Hui, Krishna Haridasan, Victor Campos, Mahdis Mahdieh, Mandy Guo, Samer Hassan, Kevin Kilgour, Arpi Vezer, Heng-Tze Cheng, Raoul de Liedekerke, Siddharth Goyal, Paul Barham, DJ Strouse, Seb Noury, Jonas Adler, Mukund Sundararajan, Sharad Vikram, Dmitry Lepikhin, Michela Paganini, Xavier Garcia, Fan Yang, Dasha Valter, Maja Trebacz, Kiran Vodrahalli, Chulayuth Asawaroengchai, Roman Ring, Norbert Kalb, Livio Baldini Soares, Siddhartha Brahma, David Steiner, Tianhe Yu, Fabian Mentzer, Antoine He, Lucas Gonzalez, Bibo Xu, Raphael Lopez Kaufman, Laurent El Shafey, Junhyuk Oh, Tom Hennigan, George van den Driessche, Seth Odoom, Mario Lucic, Becca Roelofs, Sid Lall, Amit Marathe, Betty Chan, Santiago Ontanon, Luheng He, Denis Teplyashin, Jonathan Lai, Phil Crone, Bogdan Damoc, Lewis Ho, Sebastian Riedel, Karel Lenc, Chih-Kuan Yeh, Aakanksha Chowdhery, Yang Xu, Mehran Kazemi, Ehsan Amid, Anastasia Petrushkina, Kevin Swersky, Ali Khodaei, Gowoon Chen, Chris Larkin, Mario Pinto, Geng Yan, Adria Puigdomenech Badia, Piyush Patil, Steven Hansen, Dave Orr, Sebastien M. R. 
Arnold, Jordan Grimstad, Andrew Dai, Sholto Douglas, Rishika Sinha, Vikas Yadav, Xi Chen, Elena Gribovskaya, Jacob Austin, Jeffrey Zhao, Kaushal Patel, Paul Komarek, Sophia Austin, Sebastian Borgeaud, Linda Friso, Abhimanyu Goyal, Ben Caine, Kris Cao, Da-Woon Chung, Matthew Lamm, Gabe Barth-Maron, Thais Kagohara, Kate Olszewska, Mia Chen, Kaushik Shivakumar, Rishabh Agarwal, Harshal Godhia, Ravi Rajwar, Javier Snaider, Xerxes Dotiwalla, YuAn Liu, Aditya Barua, Victor Ungureanu, Yuan Zhang, Bat-Orgil Batsaikhan, Mateo Wirth, James Qin, Ivo Danihelka, Tulsee Doshi, Martin Chadwick, Jilin Chen, Sanil Jain, Quoc Le, Arjun Kar, Madhu Gurumurthy, Cheng Li, Ruoxin Sang, Fangyu Liu, Lampros Lamprou, Rich Munoz, Nathan Lintz, Harsh Mehta, Heidi Howard, Malcolm Reynolds, Lora Aroyo, Quan Wang, Lorenzo Blanco, Albin Cassirer, Jordan Griffith, Dipanjan Das, Stephan Lee, Jakub Sygnowski, Zach Fisher, James Besley, Richard Powell, Zafarali Ahmed, Dominik Paulus, David Reitter, Zalan Borsos, Rishabh Joshi, Aedan Pope, Steven Hand, Vittorio Selo, Vihan Jain, Nikhil Sethi, Megha Goel, Takaki Makino, Rhys May, Zhen Yang, Johan Schalkwyk, Christina Butterfield, Anja Hauth, Alex Goldin, Will Hawkins, Evan Senter, Sergey Brin, Oliver Woodman, Marvin Ritter, Eric Noland, Minh Giang, Vijay Bolina, Lisa Lee, Tim Blyth, Ian Mackinnon, Machel Reid, Obaid Sarvana, David Silver, Alexander Chen, Lily Wang, Loren Maggiore, Oscar Chang, Nithya Attaluri, Gregory Thornton, Chung-Cheng Chiu, Oskar Bunyan, Nir Levine, Timothy Chung, Evgenii Eltyshev, Xiance Si, Timothy Lillicrap, Demetra Brady, Vaibhav Aggarwal, Boxi Wu, Yuanzhong Xu, Ross Mcilroy, Kartikeya Badola, Paramjit Sandhu, Erica Moreira, Wojciech Stokowiec, Ross Hemsley, Dong Li, Alex Tudor, Pranav Shyam, Elahe Rahimtoroghi, Salem Haykal, Pablo Sprechmann, Xiang Zhou, Diana Mincu, Yujia Li, Ravi Addanki, Kalpesh Krishna, Xiao Wu, Alexandre Frechette, Matan Eyal, Allan Dafoe, Dave Lacey, Jay Whang, Thi Avrahami, Ye Zhang, Emanuel Taropa, 
Hanzhao Lin, Daniel Toyama, Eliza Rutherford, Motoki Sano, HyunJeong Choe, Alex Tomala, Chalence Safranek-Shrader, Nora Kassner, Mantas Pajarskas, Matt Harvey, Sean Sechrist, Meire Fortunato, Christina Lyu, Gamaleldin Elsayed, Chenkai Kuang, James Lottes, Eric Chu, Chao Jia, Chih-Wei Chen, Peter Humphreys, Kate Baumli, Connie Tao, Rajkumar Samuel, Cicero Nogueira dos santos, Anders Andreassen, Nemanja Rakićević, Dominik Grewe, Aviral Kumar, Stephanie Winkler, Jonathan Caton, Andrew Brock, Sid Dalmia, Hannah Sheahan, Iain Barr, Yingjie Miao, Paul Natsev, Jacob Devlin, Feryal Behbahani, Flavien Prost, Yanhua Sun, Artiom Myaskovsky, Thanumalayan Sankaranarayana Pillai, Dan Hurt, Angeliki Lazaridou, Xi Xiong, Ce Zheng, Fabio Pardo, Dan Horgan, Joe Stanton, Moran Ambar, Fei Xia, Alejandro Lince, Mingqiu Wang, Basil Mustafa, Albert Webson, Hyo Lee, Rohan Anil, Martin Wicke, Timothy Dozat, Abhishek Sinha, Enrique Piqueras, Elahe Dabir, Shyam Upadhyay, Anudhyan Boral, Lisa Anne Hendricks, Corey Fry, Josip Djolonga, Yi Su, Jake Walker, Jane Labanowski, Ronny Huang, Vedant Misra, Jeremy Chen, RJ Skerry-Ryan, Avi Singh, Shruti Rijhwani, Dian Yu, Alex Castro-Ros, Beer Changpinyo, Romina Datta, Sumit Bagri, Arnar Mar Hrafnkelsson, Marcello Maggioni, Daniel Zheng, Yury Sulsky, Shaobo Hou, Tom Le Paine, Antoine Yang, Jason Riesa, Dominika Rogozinska, Dror Marcus, Dalia El Badawy, Qiao Zhang, Luyu Wang, Helen Miller, Jeremy Greer, Lars Lowe Sjos, Azade Nova, Heiga Zen, Rahma Chaabouni, Mihaela Rosca, Jiepu Jiang, Charlie Chen, Ruibo Liu, Tara Sainath, Maxim Krikun, Alex Polozov, Jean-Baptiste Lespiau, Josh Newlan, Zeyncep Cankara, Soo Kwak, Yunhan Xu, Phil Chen, Andy Coenen, Clemens Meyer, Katerina Tsihlas, Ada Ma, Juraj Gottweis, Jinwei Xing, Chenjie Gu, Jin Miao, Christian Frank, Zeynep Cankara, Sanjay Ganapathy, Ishita Dasgupta, Steph Hughes-Fitt, Heng Chen, David Reid, Keran Rong, Hongmin Fan, Joost van Amersfoort, Vincent Zhuang, Aaron Cohen, Shixiang Shane Gu, Anhad 
Mohananey, Anastasija Ilic, Taylor Tobin, John Wieting, Anna Bortsova, Phoebe Thacker, Emma Wang, Emily Caveness, Justin Chiu, Eren Sezener, Alex Kaskasoli, Steven Baker, Katie Millican, Mohamed Elhawaty, Kostas Aisopos, Carl Lebsack, Nathan Byrd, Hanjun Dai, Wenhao Jia, Matthew Wiethoff, Elnaz Davoodi, Albert Weston, Lakshman Yagati, Arun Ahuja, Isabel Gao, Golan Pundak, Susan Zhang, Michael Azzam, Khe Chai Sim, Sergi Caelles, James Keeling, Abhanshu Sharma, Andy Swing, Yaguang Li, Chenxi Liu, Carrie Grimes Bostock, Yamini Bansal, Zachary Nado, Ankesh Anand, Josh Lipschultz, Abhijit Karmarkar, Lev Proleev, Abe Ittycheriah, Soheil Hassas Yeganeh, George Polovets, Aleksandra Faust, Jiao Sun, Alban Rrustemi, Pen Li, Rakesh Shivanna, Jeremiah Liu, Chris Welty, Federico Lebron, Anirudh Baddepudi, Sebastian Krause, Emilio Parisotto, Radu Soricut, Zheng Xu, Dawn Bloxwich, Melvin Johnson, Behnam Neyshabur, Justin Mao-Jones, Renshen Wang, Vinay Ramasesh, Zaheer Abbas, Arthur Guez, Constant Segal, Duc Dung Nguyen, James Svensson, Le Hou, Sarah York, Kieran Milan, Sophie Bridgers, Wiktor Gworek, Marco Tagliasacchi, James Lee-Thorp, Michael Chang, Alexey Guseynov, Ale Jakse Hartman, Michael Kwong, Ruizhe Zhao, Sheleem Kashem, Elizabeth Cole, Antoine Miech, Richard Tanburn, Mary Phuong, Filip Pavetic, Sebastien Cevey, Ramona Comanescu, Richard Ives, Sherry Yang, Cosmo Du, Bo Li, Zizhao Zhang, Mariko Iinuma, Clara Huiyi Hu, Aurko Roy, Shaan Bijwadia, Zhenkai Zhu, Danilo Martins, Rachel Saputro, Anita Gergely, Steven Zheng, Dawei Jia, Ioannis Antonoglou, Adam Sadovsky, Shane Gu, Yingying Bi, Alek Andreev, Sina Samangooei, Mina Khan, Tomas Kocisky, Angelos Filos, Chintu Kumar, Colton Bishop, Adams Yu, Sarah Hodkinson, Sid Mittal, Premal Shah, Alexandre Moufarek, Yong Cheng, Adam Bloniarz, Jaehoon Lee, Pedram Pejman, Paul Michel, Stephen Spencer, Vladimir Feinberg, Xuehan Xiong, Nikolay Savinov, Charlotte Smith, Siamak Shakeri, Dustin Tran, Mary Chesus, Bernd Bohnet, George 
Tucker, Tamara von Glehn, Carrie Muir, Yiran Mao, Hideto Kazawa, Ambrose Slone, Kedar Soparkar, Disha Shrivastava, James Cobon-Kerr, Michael Sharman, Jay Pavagadhi, Carlos Araya, Karolis Misiunas, Nimesh Ghelani, Michael Laskin, David Barker, Qiujia Li, Anton Briukhov, Neil Houlsby, Mia Glaese, Balaji Lakshminarayanan, Nathan Schucher, Yunhao Tang, Eli Collins, Hyeontaek Lim, Fangxiaoyu Feng, Adria Recasens, Guangda Lai, Alberto Magni, Nicola De Cao, Aditya Siddhant, Zoe Ashwood, Jordi Orbay, Mostafa Dehghani, Jenny Brennan, Yifan He, Kelvin Xu, Yang Gao, Carl Saroufim, James Molloy, Xinyi Wu, Seb Arnold, Solomon Chang, Julian Schrittwieser, Elena Buchatskaya, Soroush Radpour, Martin Polacek, Skye Giordano, Ankur Bapna, Simon Tokumine, Vincent Hellendoorn, Thibault Sottiaux, Sarah Cogan, Aliaksei Severyn, Mohammad Saleh, Shantanu Thakoor, Laurent Shefey, Siyuan Qiao, Meenu Gaba, Shuo-Yiin Chang, Craig Swanson, Biao Zhang, Benjamin Lee, Paul Kishan Rubenstein, Gan Song, Tom Kwiatkowski, Anna Koop, Ajay Kannan, David Kao, Parker Schuh, Axel Stjerngren, Golnaz Ghiasi, Gena Gibson, Luke Vilnis, Ye Yuan, Felipe Tiengo Ferreira, Aishwarya Kamath, Ted Klimenko, Ken Franko, Kefan Xiao, Indro Bhattacharya, Miteyan Patel, Rui Wang, Alex Morris, Robin Strudel, Vivek Sharma, Peter Choy, Sayed Hadi Hashemi, Jessica Landon, Mara Finkelstein, Priya Jhakra, Justin Frye, Megan Barnes, Matthew Mauger, Dennis Daun, Khuslen Baatarsukh, Matthew Tung, Wael Farhan, Henryk Michalewski, Fabio Viola, Felix de Chaumont Quitry, Charline Le Lan, Tom Hudson, Qingze Wang, Felix Fischer, Ivy Zheng, Elspeth White, Anca Dragan, Jean-Baptiste Alayrac, Eric Ni, Alexander Pritzel, Adam Iwanicki, Michael Isard, Anna Bulanova, Lukas Zilka, Ethan Dyer, Devendra Sachan, Srivatsan Srinivasan, Hannah Muckenhirn, Honglong Cai, Amol Mandhane, Mukarram Tariq, Jack W. 
Rae, Gary Wang, Kareem Ayoub, Nicholas FitzGerald, Yao Zhao, Woohyun Han, Chris Alberti, Dan Garrette, Kashyap Krishnakumar, Mai Gimenez, Anselm Levskaya, Daniel Sohn, Josip Matak, Inaki Iturrate, Michael B. Chang, Jackie Xiang, Yuan Cao, Nishant Ranka, Geoff Brown, Adrian Hutter, Nanxin Chen, Kaisheng Yao, Zoltan Egyed, Francois Galilee, Tyler Liechty, Praveen Kallakuri, Evan Palmer, Sanjay Ghemawat, Jasmine Liu, David Tao, Chloe Thornton, Tim Green, Mimi Jasarevic, Sharon Lin, Victor Cotruta, Yi-Xuan Tan, Noah Fiedel, Hongkun Yu, Ed Chi, Alexander Neitz, Jens Heitkaemper, Anu Sinha, Denny Zhou, Yi Sun, Charbel Kaed, Brice Hulse, Swaroop Mishra, Maria Georgaki, Sneha Kudugunta, Clement Farabet, Izhak Shafran, Daniel Vlasic, Anton Tsitsulin, Rajagopal Ananthanarayanan, Alen Carin, Guolong Su, Pei Sun, Shashank V, Gabriel Carvajal, Josef Broder, Iulia Comsa, Alena Repina, William Wong, Warren Weilun Chen, Peter Hawkins, Egor Filonov, Lucia Loher, Christoph Hirnschall, Weiyi Wang, Jingchen Ye, Andrea Burns, Hardie Cate, Diana Gage Wright, Federico Piccinini, Lei Zhang, Chu-Cheng Lin, Ionel Gog, Yana Kulizhskaya, Ashwin Sreevatsa, Shuang Song, Luis C. Cobo, Anand Iyer, Chetan Tekur, Guillermo Garrido, Zhuyun Xiao, Rupert Kemp, Huaixiu Steven Zheng, Hui Li, Ananth Agarwal, Christel Ngani, Kati Goshvadi, Rebeca Santamaria-Fernandez, Wojciech Fica, Xinyun Chen, Chris Gorgolewski, Sean Sun, Roopal Garg, Xinyu Ye, S. M. 
Ali Eslami, Nan Hua, Jon Simon, Pratik Joshi, Yelin Kim, Ian Tenney, Sahitya Potluri, Lam Nguyen Thiet, Quan Yuan, Florian Luisier, Alexandra Chronopoulou, Salvatore Scellato, Praveen Srinivasan, Minmin Chen, Vinod Koverkathu, Valentin Dalibard, Yaming Xu, Brennan Saeta, Keith Anderson, Thibault Sellam, Nick Fernando, Fantine Huot, Junehyuk Jung, Mani Varadarajan, MICHAEL QUINN, Amit Raul, Maigo Le, Ruslan Habalov, Jon Clark, Komal Jalan, Kalesha Bullard, Achintya Singhal, Thang Luong, Boyu Wang, Sujeevan Rajayogam, Julian Eisenschlos, Johnson Jia, Daniel Finchelstein, Alex Yakubovich, Daniel Balle, Michael Fink, Sameer Agarwal, Jing Li, DJ Dvijotham, Shalini Pal, Kai Kang, Jaclyn Konzelmann, Jennifer Beattie, Olivier Dousse, Diane Wu, Remi Crocker, Chen Elkind, Siddhartha Reddy Jonnalagadda, Jong Lee, Dan Holtmann-Rice, Krystal Kallarackal, Rosanne Liu, Denis Vnukov, Neera Vats, Luca Invernizzi, Mohsen Jafari, Huanjie Zhou, Lilly Taylor, Jennifer Prendki, Marcus Wu, Tom Eccles, Tianqi Liu, Kavya Kopparapu, Francoise Beaufays, Christof Angermueller, Andreea Marzoca, Shourya Sarcar, Hilal Dib, Jeff Stanway, Frank Perbet, Nejc Trdin, Rachel Sterneck, Andrey Khorlin, Dinghua Li, Xihui Wu, Sonam Goenka, David Madras, Sasha Goldshtein, Willi Gierke, Tong Zhou, Yaxin Liu, Yannie Liang, Anais White, Yunjie Li, Shreya Singh, Sanaz Bahargam, Mark Epstein, Sujoy Basu, Li Lao, Adnan Ozturel, Carl Crous, Alex Zhai, Han Lu, Zora Tung, Neeraj Gaur, Alanna Walton, Lucas Dixon, Ming Zhang, Amir Globerson, Grant Uy, Andrew Bolt, Olivia Wiles, Milad Nasr, Ilia Shumailov, Marco Selvi, Francesco Piccinno, Ricardo Aguilar, Sara McCarthy, Misha Khalman, Mrinal Shukla, Vlado Galic, John Carpenter, Kevin Villela, Haibin Zhang, Harry Richardson, James Martens, Matko Bosnjak, Shreyas Rammohan Belle, Jeff Seibert, Mahmoud Alnahlawi, Brian McWilliams, Sankalp Singh, Annie Louis, Wen Ding, Dan Popovici, Lenin Simicich, Laura Knight, Pulkit Mehta, Nishesh Gupta, Chongyang Shi, Saaber Fatehi, 
Jovana Mitrovic, Alex Grills, Joseph Pagadora, Tsendsuren Munkhdalai, Dessie Petrova, Danielle Eisenbud, Zhishuai Zhang, Damion Yates, Bhavishya Mittal, Nilesh Tripuraneni, Yannis Assael, Thomas Brovelli, Prateek Jain, Mihajlo Velimirovic, Canfer Akbulut, Jiaqi Mu, Wolfgang Macherey, Ravin Kumar, Jun Xu, Haroon Qureshi, Gheorghe Comanici, Jeremy Wiesner, Zhitao Gong, Anton Ruddock, Matthias Bauer, Nick Felt, Anirudh GP, Anurag Arnab, Dustin Zelle, Jonas Rothfuss, Bill Rosgen, Ashish Shenoy, Bryan Seybold, Xinjian Li, Jayaram Mudigonda, Goker Erdogan, Jiawei Xia, Jiri Simsa, Andrea Michi, Yi Yao, Christopher Yew, Steven Kan, Isaac Caswell, Carey Radebaugh, Andre Elisseeff, Pedro Valenzuela, Kay McKinney, Kim Paterson, Albert Cui, Eri Latorre-Chimoto, Solomon Kim, William Zeng, Ken Durden, Priya Ponnapalli, Tiberiu Sosea, Christopher A. Choquette-Choo, James Manyika, Brona Robenek, Harsha Vashisht, Sebastien Pereira, Hoi Lam, Marko Velic, Denese Owusu-Afriyie, Katherine Lee, Tolga Bolukbasi, Alicia Parrish, Shawn Lu, Jane Park, Balaji Venkatraman, Alice Talbert, Lambert Rosique, Yuchung Cheng, Andrei Sozanschi, Adam Paszke, Praveen Kumar, Jessica Austin, Lu Li, Khalid Salama, Bartek Perz, Wooyeol Kim, Nandita Dukkipati, Anthony Baryshnikov, Christos Kaplanis, XiangHai Sheng, Yuri Chervonyi, Caglar Unlu, Diego de Las Casas, Harry Askham, Kathryn Tunyasuvunakool, Felix Gimeno, Siim Poder, Chester Kwak, Matt Miecnikowski, Vahab Mirrokni, Alek Dimitriev, Aaron Parisi, Dangyi Liu, Tomy Tsai, Toby Shevlane, Christina Kouridi, Drew Garmon, Adrian Goedeckemeyer, Adam R. 
Brown, Anitha Vijayakumar, Ali Elqursh, Sadegh Jazayeri, Jin Huang, Sara Mc Carthy, Jay Hoover, Lucy Kim, Sandeep Kumar, Wei Chen, Courtney Biles, Garrett Bingham, Evan Rosen, Lisa Wang, Qijun Tan, David Engel, Francesco Pongetti, Dario de Cesare, Dongseong Hwang, Lily Yu, Jennifer Pullman, Srini Narayanan, Kyle Levin, Siddharth Gopal, Megan Li, Asaf Aharoni, Trieu Trinh, Jessica Lo, Norman Casagrande, Roopali Vij, Loic Matthey, Bramandia Ramadhana, Austin Matthews, CJ Carey, Matthew Johnson, Kremena Goranova, Rohin Shah, Shereen Ashraf, Kingshuk Dasgupta, Rasmus Larsen, Yicheng Wang, Manish Reddy Vuyyuru, Chong Jiang, Joana Ijazi, Kazuki Osawa, Celine Smith, Ramya Sree Boppana, Taylan Bilal, Yuma Koizumi, Ying Xu, Yasemin Altun, Nir Shabat, Ben Bariach, Alex Korchemniy, Kiam Choo, Olaf Ronneberger, Chimezie Iwuanyanwu, Shubin Zhao, David Soergel, Cho-Jui Hsieh, Irene Cai, Shariq Iqbal, Martin Sundermeyer, Zhe Chen, Elie Bursztein, Chaitanya Malaviya, Fadi Biadsy, Prakash Shroff, Inderjit Dhillon, Tejasi Latkar, Chris Dyer, Hannah Forbes, Massimo Nicosia, Vitaly Nikolaev, Somer Greene, Marin Georgiev, Pidong Wang, Nina Martin, Hanie Sedghi, John Zhang, Praseem Banzal, Doug Fritz, Vikram Rao, Xuezhi Wang, Jiageng Zhang, Viorica Patraucean, Dayou Du, Igor Mordatch, Ivan Jurin, Lewis Liu, Ayush Dubey, Abhi Mohan, Janek Nowakowski, Vlad-Doru Ion, Nan Wei, Reiko Tojo, Maria Abi Raad, Drew A. 
Hudson, Vaishakh Keshava, Shubham Agrawal, Kevin Ramirez, Zhichun Wu, Hoang Nguyen, Ji Liu, Madhavi Sewak, Bryce Petrini, DongHyun Choi, Ivan Philips, Ziyue Wang, Ioana Bica, Ankush Garg, Jarek Wilkiewicz, Priyanka Agrawal, Xiaowei Li, Danhao Guo, Emily Xue, Naseer Shaik, Andrew Leach, Sadh MNM Khan, Julia Wiesinger, Sammy Jerome, Abhishek Chakladar, Alek Wenjiao Wang, Tina Ornduff, Folake Abu, Alireza Ghaffarkhah, Marcus Wainwright, Mario Cortes, Frederick Liu, Joshua Maynez, Andreas Terzis, Pouya Samangouei, Riham Mansour, Tomasz Kępa, François-Xavier Aubet, Anton Algymr, Dan Banica, Agoston Weisz, Andras Orban, Alexandre Senges, Ewa Andrejczuk, Mark Geller, Niccolo Dal Santo, Valentin Anklin, Majd Al Merey, Martin Baeuml, Trevor Strohman, Junwen Bai, Slav Petrov, Yonghui Wu, Demis Hassabis, Koray Kavukcuoglu, Jeff Dean, Oriol Vinyals

In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio.

1 Image, 2*2 Stitching Code Generation +8

Large Convolutional Model Tuning via Filter Subspace

no code implementations 1 Mar 2024 Wei Chen, Zichen Miao, Qiang Qiu

Furthermore, each filter atom can be recursively decomposed as a combination of another set of atoms, which naturally expands the number of tunable parameters in the filter subspace.

model

Graph Diffusion Policy Optimization

1 code implementation 26 Feb 2024 Yijing Liu, Chao Du, Tianyu Pang, Chongxuan Li, Min Lin, Wei Chen

Recent research has made significant progress in optimizing diffusion models for downstream objectives, which is an important pursuit in fields such as graph generation for drug design.

Drug Design Graph Generation

Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning

1 code implementation 21 Feb 2024 Zhaorui Yang, Tianyu Pang, Haozhe Feng, Han Wang, Wei Chen, Minfeng Zhu, Qian Liu

The surge in Large Language Models (LLMs) has revolutionized natural language processing, but fine-tuning them for specific tasks often encounters challenges in balancing performance and preserving general instruction-following abilities.

Instruction Following Language Modeling +2

Attractor Memory for Long-Term Time Series Forecasting: A Chaos Perspective

1 code implementation 18 Feb 2024 Jiaxi Hu, Yuehong Hu, Wei Chen, Ming Jin, Shirui Pan, Qingsong Wen, Yuxuan Liang

In long-term time series forecasting (LTSF) tasks, an increasing number of models have acknowledged that discrete time series originate from continuous dynamic systems and have attempted to model their dynamical structures.

Time Series Time Series Forecasting

AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator

1 code implementation 15 Feb 2024 Zhihao Fan, Jialong Tang, Wei Chen, Siyuan Wang, Zhongyu Wei, Jun Xi, Fei Huang, Jingren Zhou

Artificial intelligence has significantly advanced healthcare, particularly through large language models (LLMs) that excel in medical question answering benchmarks.

Benchmarking Diagnostic +1

Model Compression and Efficient Inference for Large Language Models: A Survey

no code implementations 15 Feb 2024 Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He

However, large language models have two prominent characteristics compared to smaller models: (1) Most compression algorithms require finetuning or even retraining the model after compression.

Knowledge Distillation Model Compression +1

Prismatic: Interactive Multi-View Cluster Analysis of Concept Stocks

no code implementations 14 Feb 2024 Wong Kam-Kwai, Yan Luo, Xuanwu Yue, Wei Chen, Huamin Qu

Financial cluster analysis allows investors to discover investment alternatives and avoid undertaking excessive risks.

Clustering

AgentLens: Visual Analysis for Agent Behaviors in LLM-based Autonomous Systems

no code implementations 14 Feb 2024 Jiaying Lu, Bo Pan, Jieyi Chen, Yingchaojie Feng, Jingyuan Hu, Yuchen Peng, Wei Chen

Recently, Large Language Model based Autonomous Systems (LLMAS) have gained great popularity for their potential to simulate complicated behaviors of human societies.

Language Modeling Language Modelling +1

A Unified Causal View of Instruction Tuning

no code implementations 9 Feb 2024 Lu Chen, Wei Huang, Ruqing Zhang, Wei Chen, Jiafeng Guo, Xueqi Cheng

The key idea is to learn task-required causal factors and only use those to make predictions for a given task.

FedAA: A Reinforcement Learning Perspective on Adaptive Aggregation for Fair and Robust Federated Learning

2 code implementations 8 Feb 2024 Jialuo He, Wei Chen, Xiaojin Zhang

Federated Learning (FL) has emerged as a promising approach for privacy-preserving model training across decentralized devices.

continuous-control Continuous Control +4

Learning by Doing: An Online Causal Reinforcement Learning Framework with Causal-Aware Policy

no code implementations 7 Feb 2024 Ruichu Cai, Siyang Huang, Jie Qiao, Wei Chen, Yan Zeng, Keli Zhang, Fuchun Sun, Yang Yu, Zhifeng Hao

As a key component to intuitive cognition and reasoning solutions in human intelligence, causal knowledge provides great potential for reinforcement learning (RL) agents' interpretability towards decision-making by helping reduce the searching space.

Decision Making Reinforcement Learning (RL)

Interpretable Multi-Source Data Fusion Through Latent Variable Gaussian Process

no code implementations 6 Feb 2024 Sandipp Krishnan Ravi, Yigitcan Comlek, Arjun Pathak, Vipul Gupta, Rajnikant Umretiya, Andrew Hoffman, Ghanshyam Pilania, Piyush Pandita, Sayan Ghosh, Nathaniel Mckeever, Wei Chen, Liping Wang

Additionally, a dissimilarity metric based on the latent variables of LVGP is introduced to study and understand the differences in the sources of data.

Position: What Can Large Language Models Tell Us about Time Series Analysis

1 code implementation 5 Feb 2024 Ming Jin, Yifan Zhang, Wei Chen, Kexin Zhang, Yuxuan Liang, Bin Yang, Jindong Wang, Shirui Pan, Qingsong Wen

Time series analysis is essential for comprehending the complexities inherent in various real-world systems and applications.

Decision Making Position +3

EasyFS: an Efficient Model-free Feature Selection Framework via Elastic Transformation of Features

no code implementations 4 Feb 2024 Jianming Lv, Sijun Xia, Depin Liang, Wei Chen

Traditional model-free feature selection methods treat each feature independently while disregarding the interrelationships among features, which leads to relatively poor performance compared with the model-aware methods.

feature selection

EMN: Brain-inspired Elastic Memory Network for Quick Domain Adaptive Feature Mapping

no code implementations 4 Feb 2024 Jianming Lv, ChengJun Wang, Depin Liang, Qianli Ma, Wei Chen, Xueqi Cheng

Inspired by the memory mechanism and powerful generalization ability of biological neural networks in human brains, we propose a novel gradient-free Elastic Memory Network, namely EMN, to support quick fine-tuning of the mapping between features and prediction without heavy optimization of deep features.

Memorization Unsupervised Domain Adaptation

Shapelet-based Model-agnostic Counterfactual Local Explanations for Time Series Classification

no code implementations 2 Feb 2024 Qi Huang, Wei Chen, Thomas Bäck, Niki van Stein

In this work, we propose a model-agnostic instance-based post-hoc explainability method for time series classification.

Classification counterfactual +2

Query-Efficient Correlation Clustering with Noisy Oracle

no code implementations 2 Feb 2024 Yuko Kuroki, Atsushi Miyauchi, Francesco Bonchi, Wei Chen

We study a general clustering setting in which we have $n$ elements to be clustered, and we aim to perform as few queries as possible to an oracle that returns a noisy sample of the weighted similarity between two elements.

Clustering Multi-Armed Bandits

CauESC: A Causal Aware Model for Emotional Support Conversation

no code implementations 31 Jan 2024 Wei Chen, Hengxu Lin, Qun Zhang, Xiaojin Zhang, Xiang Bai, Xuanjing Huang, Zhongyu Wei

Emotional Support Conversation aims at reducing the seeker's emotional distress through supportive response.

Synthetic data enables faster annotation and robust segmentation for multi-object grasping in clutter

no code implementations 24 Jan 2024 Dongmyoung Lee, Wei Chen, Nicolas Rojas

In this work, we propose a synthetic data generation method that minimizes human intervention and makes downstream image segmentation algorithms more robust by combining a generated synthetic dataset with a smaller real-world dataset (hybrid dataset).

Dataset Generation Image Segmentation +8

Cascading Reinforcement Learning

no code implementations 17 Jan 2024 Yihan Du, R. Srikant, Wei Chen

In the cascading bandit model, at each timestep, an agent recommends an ordered subset of items (called an item list) from a pool of items, each associated with an unknown attraction probability.

Recommendation Systems reinforcement-learning +1

Attention-Based CNN-BiLSTM for Sleep State Classification of Spatiotemporal Wide-Field Calcium Imaging Data

1 code implementation 16 Jan 2024 Xiaohui Zhang, Eric C. Landsness, Hanyang Miao, Wei Chen, Michelle Tang, Lindsey M. Brier, Joseph P. Culver, Jin-Moo Lee, Mark A. Anastasio

Comparison with Existing Method: On a 3-hour WFCI recording, the CNN-BiLSTM achieved a kappa of 0.67, comparable to a kappa of 0.65 corresponding to the human EEG/EMG-based scoring.

EEG

The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in CNVSRC 2023

2 code implementations 7 Jan 2024 He Wang, Pengcheng Guo, Wei Chen, Pan Zhou, Lei Xie

This paper delineates the visual speech recognition (VSR) system introduced by the NPU-ASLP-LiAuto (Team 237) in the first Chinese Continuous Visual Speech Recognition Challenge (CNVSRC) 2023, engaging in the fixed and open tracks of Single-Speaker VSR Task, and the open track of Multi-Speaker VSR Task.

Decoder speech-recognition +1

ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge

no code implementations 7 Jan 2024 He Wang, Pengcheng Guo, Yue Li, Ao Zhang, Jiayao Sun, Lei Xie, Wei Chen, Pan Zhou, Hui Bu, Xin Xu, BinBin Zhang, Zhuo Chen, Jian Wu, Longbiao Wang, Eng Siong Chng, Sun Li

To promote speech processing and recognition research in driving scenarios, we build on the success of the Intelligent Cockpit Speech Recognition Challenge (ICSRC) held at ISCSLP 2022 and launch the ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Large Language Models for Generative Information Extraction: A Survey

1 code implementation 29 Dec 2023 Derong Xu, Wei Chen, Wenjun Peng, Chao Zhang, Tong Xu, Xiangyu Zhao, Xian Wu, Yefeng Zheng, Yang Wang, Enhong Chen

Information extraction (IE) aims to extract structural knowledge from plain natural language texts.

Survey

MolSets: Molecular Graph Deep Sets Learning for Mixture Property Modeling

1 code implementation 27 Dec 2023 Hengrui Zhang, Jie Chen, James M. Rondinelli, Wei Chen

This complexity is particularly evident in molecular mixtures, a frequently explored space for materials such as battery electrolytes.

Graph Neural Network mixture property prediction +1

Identification of Causal Structure with Latent Variables Based on Higher Order Cumulants

no code implementations 19 Dec 2023 Wei Chen, Zhiyi Huang, Ruichu Cai, Zhifeng Hao, Kun Zhang

Despite the emergence of numerous methods aimed at addressing this challenge, they are not fully identified to the structure that two observed variables are influenced by one latent variable and there might be a directed edge in between.

Causal Discovery

Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off

no code implementations 16 Dec 2023 Yu-An Liu, Ruqing Zhang, Mingkun Zhang, Wei Chen, Maarten de Rijke, Jiafeng Guo, Xueqi Cheng

We decompose the robust ranking error into two components, i.e., a natural ranking error for effectiveness evaluation and a boundary ranking error for assessing adversarial robustness.

Adversarial Robustness Information Retrieval

K-ESConv: Knowledge Injection for Emotional Support Dialogue Systems via Prompt Learning

no code implementations 16 Dec 2023 Wei Chen, Gang Zhao, Xiaojin Zhang, Xiang Bai, Xuanjing Huang, Zhongyu Wei

Automatic psychological counseling requires a mass of professional knowledge that can be found in online counseling forums.

Diversity Response Generation

Enlighten-Your-Voice: When Multimodal Meets Zero-shot Low-light Image Enhancement

no code implementations 15 Dec 2023 Xiaofeng Zhang, Zishan Xu, Hao Tang, Chaochen Gu, Wei Chen, Shanying Zhu, Xinping Guan

Low-light image enhancement is a crucial visual task, and many unsupervised methods tend to overlook the degradation of visible information in low-light scenes, which adversely affects the fusion of complementary information and hinders the generation of satisfactory results.

Low-Light Image Enhancement

Generative Inverse Design of Metamaterials with Functional Responses by Interpretable Learning

1 code implementation 8 Dec 2023 Wei "Wayne" Chen, Rachel Sun, Doksoo Lee, Carlos M. Portela, Wei Chen

Unlike data-intensive and non-interpretable deep-learning-based methods, we propose the Random-forest-based Interpretable Generative Inverse Design (RIGID), a single-shot inverse design method for fast generation of metamaterial designs with on-demand functional behaviors.

Interpretable Machine Learning

Local-Global History-aware Contrastive Learning for Temporal Knowledge Graph Reasoning

no code implementations 4 Dec 2023 Wei Chen, Huaiyu Wan, Yuting Wu, Shuyuan Zhao, Jiayaqi Cheng, Yuxin Li, Youfang Lin

Temporal knowledge graphs (TKGs) have been identified as a promising approach to represent the dynamics of facts along the timeline.

Contrastive Learning Knowledge Graphs

A Cyclic Small Phase Theorem

no code implementations 1 Dec 2023 Chao Chen, Wei Chen, Di Zhao, Jianqi Chen, Li Qiu

This paper introduces a brand-new phase definition called the segmental phase for multi-input multi-output linear time-invariant systems.

Phase Preservation of N-Port Networks under General Connections

no code implementations 28 Nov 2023 Jianqi Chen, Wei Chen, Chao Chen, Li Qiu

In addition, the inverse operations of the considered connections, that is, network subtractions with correspondences are examined.

Statistical Parameterized Physics-Based Machine Learning Digital Twin Models for Laser Powder Bed Fusion Process

no code implementations 14 Nov 2023 Yangfan Li, Satyajit Mojumder, Ye Lu, Abdullah Al Amin, Jiachen Guo, Xiaoyu Xie, Wei Chen, Gregory J. Wagner, Jian Cao, Wing Kam Liu

In the context of laser powder bed fusion (LPBF) additive manufacturing, a digital twin of the manufacturing process can offer predictions for the produced parts, diagnostics for manufacturing defects, as well as control capabilities.

Diagnostic

CAME: Competitively Learning a Mixture-of-Experts Model for First-stage Retrieval

no code implementations 6 Nov 2023 Yinqiong Cai, Yixing Fan, Keping Bi, Jiafeng Guo, Wei Chen, Ruqing Zhang, Xueqi Cheng

The first-stage retrieval aims to retrieve a subset of candidate documents from a huge collection both effectively and efficiently.

Retrieval

Closing the Gap Between the Upper Bound and the Lower Bound of Adam's Iteration Complexity

no code implementations 27 Oct 2023 Bohan Wang, Jingwen Fu, Huishuai Zhang, Nanning Zheng, Wei Chen

Recently, Arjevani et al. [1] established a lower bound of iteration complexity for the first-order optimization under an $L$-smooth condition and a bounded noise variance assumption.

LEMMA valid
