1 code implementation • 19 May 2025 • Shiao Wang, Xiao Wang, Liye Jin, Bo Jiang, Lin Zhu, Lan Chen, Yonghong Tian, Bin Luo
Existing tracking algorithms typically rely on low-frame-rate RGB cameras coupled with computationally intensive deep neural network architectures to achieve effective tracking.
1 code implementation • 19 May 2025 • Xiao Wang, Yu Jin, Lan Chen, Bo Jiang, Lin Zhu, Yonghong Tian, Jin Tang, Bin Luo
To address these issues, this paper proposes a novel dynamic graph induced contour-aware heat conduction network for event stream based object detection, termed CvHeat-DET.
no code implementations • 19 May 2025 • Jack Chen, Fazhong Liu, Naruto Liu, Yuhan Luo, Erqu Qin, Harry Zheng, Tian Dong, Haojin Zhu, Yan Meng, Xiao Wang
Large language models (LLMs) excel at mathematical reasoning and logical problem-solving.
1 code implementation • 18 May 2025 • Yang Hu, Xingyu Zhang, Xueji Fang, Zhiyang Chen, Xiao Wang, Huatian Zhang, GuoJun Qi
We propose SLOT (Sample-specific Language Model Optimization at Test-time), a novel and parameter-efficient test-time inference approach that enhances a language model's ability to more accurately respond to individual prompts.
no code implementations • 17 May 2025 • Xiao Wang, Shun-Ren Yang
Additionally, the optimal frequency for rotational position encoding is determined through a grid search approach in both the spatial and temporal attention mechanisms.
no code implementations • 9 May 2025 • Xi Xiao, Yunbei Zhang, Thanh-Huy Nguyen, Ba-Thinh Lam, Janet Wang, Jihun Hamm, Tianyang Wang, Xingjian Li, Xiao Wang, Hao Xu, Tianming Liu, Min Xu
Localized image captioning has made significant progress with models like the Describe Anything Model (DAM), which can generate detailed region-specific descriptions without explicit region-text supervision.
no code implementations • 7 May 2025 • Xiao Wang, Jong-Youl Choi, Takuya Kurihaya, Isaac Lyngaas, Hong-Jun Yoon, Ming Fan, Nasik Muhammad Nafi, Aristeidis Tsaris, Ashwin M. Aji, Maliha Hossain, Mohamed Wahib, Dali Wang, Peter Thornton, Prasanna Balaprakash, Moetasim Ashfaq, Dan Lu
It supports downscaling to 0.9 km global resolution and processes sequences up to 4.2 billion tokens.
no code implementations • 23 Apr 2025 • Yuxiang Wei, Yanteng Zhang, Xi Xiao, Tianyang Wang, Xiao Wang, Vince D. Calhoun
Multimodal neuroimaging provides complementary structural and functional insights into both human brain organization and disease-related dynamics.
no code implementations • 22 Apr 2025 • Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Junyuan Mao, Yu Wang, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Wenjie Qu, Yue Liu, Chengwei Liu, Yifan Zhang, Qiankun Li, Chongye Guo, Yalan Qin, Zhaoxin Fan, Yi Ding, Donghai Hong, Jiaming Ji, Yingxin Lai, Zitong Yu, Xinfeng Li, Yifan Jiang, Yanhui Li, Xinyu Deng, Junlin Wu, Dongxia Wang, Yihao Huang, Yufei Guo, Jen-tse Huang, Qiufeng Wang, Wenxuan Wang, Dongrui Liu, Yanwei Yue, Wenke Huang, Guancheng Wan, Heng Chang, Tianlin Li, Yi Yu, Chenghao Li, Jiawei Li, Lei Bai, Jie Zhang, Qing Guo, Jingyi Wang, Tianlong Chen, Joey Tianyi Zhou, Xiaojun Jia, Weisong Sun, Cong Wu, Jing Chen, Xuming Hu, Yiming Li, Xiao Wang, Ningyu Zhang, Luu Anh Tuan, Guowen Xu, Jiaheng Zhang, Tianwei Zhang, Xingjun Ma, Jindong Gu, Xiang Wang, Bo An, Jun Sun, Mohit Bansal, Shirui Pan, Lingjuan Lyu, Yuval Elovici, Bhavya Kailkhura, Yaodong Yang, Hongwei Li, Wenyuan Xu, Yizhou Sun, Wei Wang, Qing Li, Ke Tang, Yu-Gang Jiang, Felix Juefei-Xu, Hui Xiong, XiaoFeng Wang, DaCheng Tao, Philip S. Yu, Qingsong Wen, Yang Liu
Existing surveys on LLM safety primarily focus on specific stages of the LLM lifecycle, e.g., the deployment phase or the fine-tuning phase, lacking a comprehensive understanding of the entire "lifechain" of LLMs.
1 code implementation • 19 Apr 2025 • Qiang Chen, Xiao Wang, Haowen Wang, Bo Jiang, Lin Zhu, Dawei Zhang, Yonghong Tian, Jin Tang
To bridge this gap, in this paper, we propose a cross-modal adversarial attack algorithm for RGB-Event visual tracking.
1 code implementation • 17 Apr 2025 • Wentao Wu, Xiao Wang, Chenglong Li, Bo Jiang, Jin Tang, Bin Luo, Qi Liu
Event cameras have attracted increasing attention in recent years due to their advantages in high dynamic range, high temporal resolution, low power consumption, and low latency.
no code implementations • 14 Apr 2025 • Jingyun Yang, Ruoyan Avery Yin, Chi Jiang, Yuepeng Hu, Xiaokai Zhu, Xingjian Hu, Sutharsika Kumar, Xiao Wang, Xiaohua Zhai, Keran Rong, Yunyue Zhu, Tianyi Zhang, Zongyou Yin, Jing Kong, Neil Zhenqiang Gong, Zhichu Ren, Haozhe Wang
This work represents the implementation of foundation models to achieve autonomous analysis, establishing a scalable and data-efficient characterization paradigm that fundamentally transforms the approach to nanoscale materials research.
1 code implementation • 14 Apr 2025 • Xiao Wang, Haiyang Wang, Shiao Wang, Qiang Chen, Jiandong Jin, Haoyu Song, Bo Jiang, Chenglong Li
In this paper, we revisit these issues and propose a novel multi-modal RGB-Event attribute recognition task by drawing inspiration from the advantages of event cameras in low-light, high-speed, and low-power consumption.
no code implementations • 13 Apr 2025 • Qiankun Shi, Xiao Wang, Hao Wang
In particular, starting from a near-feasible initial point and using Rademacher smoothing, the oracle complexity is of order \(O(p d^{2/p} \epsilon^{-3})\) for \(p \in [2, 2 \ln d]\), and \(O(\ln d \cdot \epsilon^{-3})\) for \(p > 2 \ln d\), where \(d\) denotes the problem dimension.
1 code implementation • 8 Apr 2025 • Shiao Wang, Xiao Wang, Bo Jiang, Lin Zhu, Guoqi Li, YaoWei Wang, Yonghong Tian, Jin Tang
In this work, we rethink human activity recognition by combining the RGB and event cameras.
no code implementations • 2 Apr 2025 • Xiao Wang, Daniil Larionov, Siwei Wu, Yiqi Liu, Steffen Eger, Nafise Sadat Moosavi, Chenghua Lin
In this work, we introduce ContrastScore, a contrastive evaluation metric designed to enable higher-quality, less biased, and more efficient assessment of generated text.
no code implementations • 22 Mar 2025 • Xi Xiao, Yunbei Zhang, Yanshuh Li, Xingjian Li, Tianyang Wang, Jihun Hamm, Xiao Wang, Min Xu
Parameter-efficient fine-tuning (PEFT) has emerged as a crucial approach for adapting large vision transformers to downstream tasks without the prohibitive computational costs of full fine-tuning.
no code implementations • 20 Mar 2025 • Xiao Wang, Hendrik Borras, Bernhard Klein, Holger Fröning
One of the most effective techniques for enhancing robustness, Noisy Training, introduces noise during the training phase to reinforce the model against disturbances encountered during inference.
no code implementations • 20 Mar 2025 • Qiankun Shi, Jie Peng, Kun Yuan, Xiao Wang, Qing Ling
We establish the lower bounds on the Byzantine error and on the minimum number of queries to a stochastic gradient oracle required to achieve an arbitrarily small optimization error.
no code implementations • 19 Mar 2025 • Hang Li, Xiao Wang, Bevan Koopman, Guido Zuccon
Pseudo-relevance feedback (PRF) refines queries by leveraging initially retrieved documents to improve retrieval effectiveness.
no code implementations • 18 Mar 2025 • Qiang Qi, Xiao Wang
Video object detection has made significant progress in recent years thanks to convolutional neural networks (CNNs) and vision transformers (ViTs).
Ranked #5 on Video Object Detection on ImageNet VID
no code implementations • 17 Mar 2025 • Zhicheng Zhao, Jinquan Yan, Chenglong Li, Xiao Wang, Jin Tang
Optical remote sensing image dehazing presents significant challenges due to its extensive spatial scale and highly non-uniform haze distribution, which traditional single-image dehazing methods struggle to address effectively.
1 code implementation • 16 Mar 2025 • Xiao Wang, Qingyi Si, Jianlong Wu, Shiyu Zhu, Li Cao, Liqiang Nie
Multimodal Large Language Models (MLLMs) have revolutionized video understanding, yet are still limited by context length when processing long videos.
no code implementations • 11 Mar 2025 • Chenrui Ma, Rongchang Zhao, Xi Xiao, Hongyang Xie, Tianyang Wang, Xiao Wang, Hao Zhang, Yanning Shen
While deep generative models have significantly advanced representation learning, they may inherit or amplify biases and fairness issues by encoding sensitive attributes alongside predictive features.
no code implementations • 10 Mar 2025 • Wentao Wu, Chenglong Li, Xiao Wang, Bin Luo, Qi Liu
To address this problem, we propose a Large Language Model (LLM) guided Progressive feature Alignment Network called LPANet, which leverages the semantic features extracted from a large language model to guide the progressive semantic and spatial alignment between modalities for multimodal UAV object detection.
1 code implementation • 9 Mar 2025 • Xiao Wang, Yuehang Li, Fuling Wang, Bo Jiang, YaoWei Wang, Yonghong Tian, Jin Tang, Bin Luo
Accurate sign language understanding serves as a crucial communication channel for individuals with disabilities.
no code implementations • 9 Mar 2025 • Xiao Wang, Lu Dong, Sahana Rangasrinivasan, Ifeoma Nwogu, Srirangaraj Setlur, Venugopal Govindaraju
The social robot's open API allows users to customize open-domain interactions.
no code implementations • 1 Mar 2025 • Lixu Wang, Bingqi Shang, Yi Li, Payal Mohapatra, Wei Dong, Xiao Wang, Qi Zhu
SA, inspired by split learning (SL), segments the pre-trained ViT into a frontend and a backend, with only the frontend shared with the client for data representation extraction.
no code implementations • 28 Feb 2025 • Xiao Wang, Jingyun Hua, WeiHong Lin, Yuanxing Zhang, Fuzheng Zhang, Jianlong Wu, Di Zhang, Liqiang Nie
Recent Multi-modal Large Language Models (MLLMs) have made great progress in video understanding.
no code implementations • 26 Feb 2025 • Yi Feng, Xiao Wang, Tian Xie
We consider nonconvex optimization problems over the simplex and, more generally, over a product of simplices.
1 code implementation • 24 Feb 2025 • Yuming Yang, Yang Nan, Junjie Ye, Shihan Dou, Xiao Wang, Shuo Li, Huijie Lv, Tao Gui, Qi Zhang, Xuanjing Huang
To address this, we systematically analyze 11 existing diversity measurement methods by assessing their correlation with model performance through extensive fine-tuning experiments.
1 code implementation • 20 Feb 2025 • Michael Tschannen, Alexey Gritsenko, Xiao Wang, Muhammad Ferjad Naeem, Ibrahim Alabdulmohsin, Nikhil Parthasarathy, Talfan Evans, Lucas Beyer, Ye Xia, Basil Mustafa, Olivier Hénaff, Jeremiah Harmsen, Andreas Steiner, Xiaohua Zhai
We introduce SigLIP 2, a family of new multilingual vision-language encoders that build on the success of the original SigLIP.
no code implementations • 19 Feb 2025 • Guangzhi Xiong, Qiao Jin, Xiao Wang, Yin Fang, Haolin Liu, Yifan Yang, Fangyuan Chen, Zhixing Song, Dengyu Wang, Minjia Zhang, Zhiyong Lu, Aidong Zhang
Retrieval-augmented generation (RAG) has shown great potential for knowledge-intensive tasks, but its traditional architectures rely on static retrieval, limiting their effectiveness for complex questions that require sequential information-seeking.
1 code implementation • 13 Feb 2025 • Chuanhui Liu, Xiao Wang
This paper provides a comprehensive analysis of variational inference in latent variable models for survival analysis, emphasizing the distinctive challenges associated with applying variational methods to survival data.
1 code implementation • 13 Feb 2025 • Xiao Wang, Jingtao Jiang, Dong Li, Futian Wang, Lin Zhu, YaoWei Wang, Yongyong Tian, Jin Tang
Mainstream Scene Text Recognition (STR) algorithms are developed based on RGB cameras which are sensitive to challenging factors such as low illumination, motion blur, and cluttered backgrounds.
no code implementations • 11 Feb 2025 • Xiao Wang, Ibrahim Alabdulmohsin, Daniel Salz, Zhe Li, Keran Rong, Xiaohua Zhai
We provide an empirical investigation of the potential of pre-training vision-language models on an unprecedented scale: 100 billion examples.
1 code implementation • 8 Feb 2025 • Shiao Wang, Xiao Wang, Chao Wang, Liye Jin, Lin Zhu, Bo Jiang, Yonghong Tian, Jin Tang
We then introduce a novel hierarchical knowledge distillation strategy that incorporates the similarity matrix, feature representation, and response map-based distillation to guide the learning of the student Transformer network.
no code implementations • 3 Feb 2025 • Maliha Hossain, Yuankai Huo, Xinqiang Yan, Xiao Wang
Instead of directly using a 3D prior, this work proposes a BM3D Multi Slice Fusion (BM3D-MSF) prior that uses multiple 2D image denoisers fused to act as a fully 3D prior model in a Plug-and-Play reconstruction approach.
no code implementations • 24 Jan 2025 • Xiao Wang, Hendrik Borras, Bernhard Klein, Holger Fröning
This work investigates the effectiveness of training neural networks with quantization to increase the robustness against noise.
no code implementations • 21 Jan 2025 • Zhengyi Lu, Hao Liang, Ming Lu, Xiao Wang, Xinqiang Yan, Yuankai Huo
This approach offers a faster and more efficient solution to RF shimming challenges in UHF MRI.
no code implementations • 17 Jan 2025 • Futian Wang, Fengxiang Liu, Xiao Wang
In the realm of multi-object tracking, the challenge of accurately capturing the spatial and temporal relationships between objects in video sequences remains a significant hurdle.
1 code implementation • 10 Jan 2025 • Yanfan Zhu, Issac Lyngaas, Murali Gopalakrishnan Meena, Mary Ellen I. Koran, Bradley Malin, Daniel Moyer, Shunxing Bao, Anuj Kapadia, Xiao Wang, Bennett Landman, Yuankai Huo
A prominent method within this area, called Unlearnable Clustering (UC), has shown improved UE performance with larger batch sizes but was previously limited by computational resources.
1 code implementation • 7 Jan 2025 • Xiao Wang, Fuling Wang, Haowen Wang, Bo Jiang, Chuanfu Li, YaoWei Wang, Yonghong Tian, Jin Tang
X-ray image based medical report generation has achieved significant progress in recent years with the help of large language models; however, these models have not fully exploited the effective information in visual image regions, resulting in reports that are linguistically sound but insufficient in describing key diseases.
1 code implementation • 6 Jan 2025 • Zhongjian Zhang, Mengmei Zhang, Xiao Wang, Lingjuan Lyu, Bo Yan, Junping Du, Chuan Shi
Unlike FL, FR has a unique sparse aggregation mechanism, where the embedding of each item is updated by only partial clients, instead of full clients in a dense aggregation of general FL.
no code implementations • 30 Dec 2024 • Haitian Chen, Qingyao Ai, Xiao Wang, Yiqun Liu, Fen Lin, Qin Liu
In response to these challenges, we propose to improve the robustness of dense retrieval models by enhancing their sensitivity to fine-grained relevance signals.
1 code implementation • 29 Dec 2024 • Xiao Wang, Qingyi Si, Jianlong Wu, Shiyu Zhu, Li Cao, Liqiang Nie
Video Large Language Models (VideoLLMs) have made significant strides in video understanding but struggle with long videos due to the limitations of their backbone LLMs.
1 code implementation • 28 Dec 2024 • Lan Chen, Haoxiang Yang, Pengpeng Shao, Haoyu Song, Xiao Wang, Zhicheng Zhao, YaoWei Wang, Yonghong Tian
Inspired by the successful application of large models, the introduction of such large models can also be considered to further enhance the performance of multi-modal tasks.
no code implementations • 20 Dec 2024 • Chenyi Cai, Biao Li, Qiyan Zhang, Xiao Wang, Filip Biljecki, Pieter Herthogs
This paper highlights the importance of establishing a bi-directional mapping between morphology metrics and complex urban form to enable the integration of urban form generation with performance evaluation.
1 code implementation • 17 Dec 2024 • Mingxu Chai, Ziyu Shen, Chong Zhang, Yue Zhang, Xiao Wang, Shihan Dou, Jihua Kang, Jiazheng Zhang, Qi Zhang
Document parsing is essential for analyzing complex document structures and extracting fine-grained information, supporting numerous downstream applications.
1 code implementation • 9 Dec 2024 • Xiao Wang, Yu Jin, Wentao Wu, Wei zhang, Lin Zhu, Bo Jiang, Yonghong Tian
Object detection in event streams has emerged as a cutting-edge research area, demonstrating superior performance in low-light conditions, scenarios with motion blur, and rapid movements.
no code implementations • 9 Dec 2024 • Kentaroh Toyoda, Xiao Wang, Mingzhe Li, Bo Gao, YuAn Wang, Qingsong Wei
Blockchain data analysis is essential for deriving insights, tracking transactions, identifying patterns, and ensuring the integrity and security of decentralized networks.
1 code implementation • 4 Dec 2024 • Andreas Steiner, André Susano Pinto, Michael Tschannen, Daniel Keysers, Xiao Wang, Yonatan Bitton, Alexey Gritsenko, Matthias Minderer, Anthony Sherbondy, Shangbang Long, Siyang Qin, Reeve Ingle, Emanuele Bugliarello, Sahar Kazemzadeh, Thomas Mesnard, Ibrahim Alabdulmohsin, Lucas Beyer, Xiaohua Zhai
PaliGemma 2 is an upgrade of the PaliGemma open Vision-Language Model (VLM) based on the Gemma 2 family of language models.
no code implementations • 25 Nov 2024 • Zhiheng Xi, Dingwen Yang, Jixuan Huang, Jiafu Tang, Guanyu Li, Yiwen Ding, wei he, Boyang Hong, Shihan Do, WenYu Zhan, Xiao Wang, Rui Zheng, Tao Ji, Xiaowei Shi, Yitao Zhai, Rongxiang Weng, Jingang Wang, Xunliang Cai, Tao Gui, Zuxuan Wu, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Yu-Gang Jiang
Experiments show that the method improves the actor's exploration efficiency and solution diversity, especially on challenging queries, leading to a stronger reasoning model.
1 code implementation • 24 Nov 2024 • Zhengyi Li, Kang Yang, Jin Tan, Wen-jie Lu, Haoqi Wu, Xiao Wang, Yu Yu, Derun Zhao, Yancheng Zheng, Minyi Guo, Jingwen Leng
For the linear layer, we propose a new 2PC paradigm along with an encoding approach to securely compute matrix multiplications based on an outer-product insight, which achieves $2.9\times \sim 12.5\times$ performance improvements compared to the state-of-the-art (SOTA) protocol.
1 code implementation • 12 Nov 2024 • Yang Hu, Xiao Wang, Lirong Wu, Huatian Zhang, Stan Z. Li, Sheng Wang, Tianlong Chen
FM-TS is more efficient in terms of training and inference.
no code implementations • 21 Oct 2024 • Runkang Guo, Bin Chen, Qi Zhang, Yong Zhao, Xiao Wang, Zhengqiu Zhu
Our approach leverages the strengths of both physical models and PIML.
no code implementations • 12 Oct 2024 • Kexin Li, Luwei Bai, Xiao Wang, Hao Wang
Anderson acceleration is an effective technique for enhancing the efficiency of fixed-point iterations; however, analyzing its convergence in nonsmooth settings presents significant challenges.
no code implementations • 11 Oct 2024 • Shiao Wang, Yifeng Wang, Qingchuan Ma, Xiao Wang, Ning Yan, Qingquan Yang, Guosheng Xu, Jin Tang
Q-distribution prediction is a crucial research direction in controlled nuclear fusion, with deep learning emerging as a key approach to solving prediction challenges.
no code implementations • 11 Oct 2024 • Qingchuan Ma, Shiao Wang, Tong Zheng, Xiaodong Dai, Yifeng Wang, Qingquan Yang, Xiao Wang
This study addresses the critical challenge of predicting the Q-distribution in the long-term stable nuclear fusion task, a key component for advancing clean energy solutions.
no code implementations • 10 Oct 2024 • Yumiao Zhao, Bo Jiang, Xiao Wang, Qin Xu, Jin Tang
To address these issues, in this paper, we propose a novel Heterogeneous Graph Adapter for tuning VLMs on downstream tasks.
1 code implementation • 10 Oct 2024 • Haiyang Wang, Qian Zhu, Mowen She, Yabo Li, Haoyu Song, Minghe Xu, Xiao Wang
To address this issue, in this paper, we propose a Spiking Neural Network (SNN) based framework for energy-efficient attribute recognition.
1 code implementation • 1 Oct 2024 • Xiao Wang, Fuling Wang, Yuehang Li, Qingchuan Ma, Shiao Wang, Bo Jiang, Chuanfu Li, Jin Tang
Thus, we conduct a comprehensive benchmarking of existing mainstream X-ray report generation models and large language models (LLMs), on the CheXpert Plus dataset.
no code implementations • 29 Sep 2024 • Xiao Wang, Jianlong Wu, Zijia Lin, Fuzheng Zhang, Di Zhang, Liqiang Nie
For iterative refinement, we first leverage a video-language model to generate synthetic annotations, resulting in a refined dataset.
1 code implementation • 27 Sep 2024 • Yixuan Qiu, Qingyi Gao, Xiao Wang
Generative models based on latent variables, such as generative adversarial networks (GANs) and variational auto-encoders (VAEs), have attracted considerable interest due to their impressive performance in many fields.
no code implementations • 29 Aug 2024 • Lu Dong, Xiao Wang, Srirangaraj Setlur, Venu Govindaraju, Ifeoma Nwogu
Our experimental results demonstrate that our proposed method outperforms the state of the art on the AffectNet VA estimation and RAF-DB classification tasks.
1 code implementation • 27 Aug 2024 • Siyuan Yao, Hao Sun, Tian-Zhu Xiang, Xiao Wang, Xiaochun Cao
In this paper, we propose a hierarchical graph interaction network termed HGINet for camouflaged object detection, which is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features.
1 code implementation • 23 Aug 2024 • Wentao Wu, Fanghua Hong, Xiao Wang, Chenglong Li, Jin Tang
In this work, we propose a new vehicle detection paradigm based on a pre-trained foundation vehicle model (VehicleMAE) and a large language model (T5), termed VFM-Det.
no code implementations • 21 Aug 2024 • Zhengyi Lu, Hao Liang, Xiao Wang, Xinqiang Yan, Yuankai Huo
We propose a two-step deep learning strategy.
1 code implementation • 20 Aug 2024 • Xiao Wang, Chao Wang, Shiao Wang, Xixi Wang, Zhicheng Zhao, Lin Zhu, Bo Jiang
More importantly, we consider introducing a dynamic template update strategy into the tracking framework using the Memory Mamba network.
1 code implementation • 20 Aug 2024 • Xiao Wang, Yao Rong, Fuling Wang, Jianing Li, Lin Zhu, Bo Jiang, YaoWei Wang
Based on this dataset and several other large-scale datasets, we propose a novel baseline method that fully leverages the Mamba model's ability to integrate temporal information of CNN features, resulting in improved sign language translation outcomes.
2 code implementations • 19 Aug 2024 • Jiandong Jin, Xiao Wang, Qian Zhu, Haiyang Wang, Chenglong Li
To address this issue, this paper proposes a new large-scale, cross-domain pedestrian attribute recognition dataset to fill the data gap, termed MSP60K.
1 code implementation • 19 Aug 2024 • Xiao Wang, Shiao Wang, Pengpeng Shao, Bo Jiang, Lin Zhu, Yonghong Tian
In this paper, we propose a large-scale, high-definition ($1280 \times 800$) human action recognition dataset based on the CeleX-V event camera, termed CeleX-HAR.
1 code implementation • 19 Aug 2024 • Xiao Wang, Yuehang Li, Fuling Wang, Shiao Wang, Chuanfu Li, Bo Jiang
They usually adopt a Transformer to extract the visual features of a given X-ray image, and then, feed them into the LLM for text generation.
1 code implementation • 16 Aug 2024 • Zhongjian Zhang, Xiao Wang, Huichi Zhou, Yue Yu, Mengmei Zhang, Cheng Yang, Chuan Shi
By presenting the empirical results, we find that although LLMs can improve the robustness of GNNs, there is still an average decrease of 23.1% in accuracy, implying that the GNNs remain extremely vulnerable against topology attacks.
1 code implementation • 15 Aug 2024 • Xixi Wang, Zitian Wang, Jingtao Jiang, Lan Chen, Xiao Wang, Bo Jiang
We also introduce a motion augmented strategy that leverages motion cues as an additional output to aggregate with the spatial features for improved results.
no code implementations • 11 Aug 2024 • Bohao Xu, Yingzhou Lu, Chenhao Li, Ling Yue, Xiao Wang, Nan Hao, Tianfan Fu, Jim Chen
In drug discovery, predicting the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of small-molecule drugs is critical for ensuring safety and efficacy.
no code implementations • 6 Aug 2024 • Siqi Lu, Junlin Guo, James R Zimmer-Dauphinee, Jordan M Nieusma, Xiao Wang, Parker VanValkenburgh, Steven A Wernke, Yuankai Huo
Artificial Intelligence (AI) technologies have profoundly transformed the field of remote sensing, revolutionizing data collection, processing, and analysis.
1 code implementation • 1 Aug 2024 • Guangzhi Xiong, Qiao Jin, Xiao Wang, Minjia Zhang, Zhiyong Lu, Aidong Zhang
The emergent abilities of large language models (LLMs) have demonstrated great potential in solving medical questions.
no code implementations • 31 Jul 2024 • Pengjie Zhang, Lin Zhu, Xiao Wang, Lizhi Wang, Wanxuan Lu, Hua Huang
Specifically, our method utilizes a Temporal Recurrent Network to aggregate event features across temporal or spatial domains, and a Spatial Contextual Attention to enhance knowledge transfer across event flows via temporal or spatial interactions.
no code implementations • 15 Jul 2024 • Lin Zhu, Yunlong Zheng, Yijun Zhang, Xiao Wang, Lizhi Wang, Hua Huang
However, current methods often prioritize the extraction of temporal information from continuous event flow, leading to an overemphasis on low-frequency texture features in the scene, resulting in over-smoothing and blurry artifacts.
1 code implementation • 15 Jul 2024 • Xiao Wang, Weizhe Kong, Jiandong Jin, Shiao Wang, Ruichong Gao, Qingchuan Ma, Chenglong Li, Jin Tang
To further tap into the potential of the novel Mamba architecture for PAR tasks, this paper designs and adapts Mamba into two typical PAR frameworks, i.e., the text-image fusion approach and pure vision Mamba multi-label recognition framework.
1 code implementation • 14 Jul 2024 • Chongyang Gao, Lixu Wang, Kaize Ding, Chenkai Weng, Xiao Wang, Qi Zhu
The results indicate that OOO consistently achieves the best unlearning effectiveness and utility preservation, especially when facing continuous unlearning requests.
1 code implementation • 10 Jul 2024 • Lucas Beyer, Andreas Steiner, André Susano Pinto, Alexander Kolesnikov, Xiao Wang, Daniel Salz, Maxim Neumann, Ibrahim Alabdulmohsin, Michael Tschannen, Emanuele Bugliarello, Thomas Unterthiner, Daniel Keysers, Skanda Koppula, Fangyu Liu, Adam Grycner, Alexey Gritsenko, Neil Houlsby, Manoj Kumar, Keran Rong, Julian Eisenschlos, Rishabh Kabra, Matthias Bauer, Matko Bošnjak, Xi Chen, Matthias Minderer, Paul Voigtlaender, Ioana Bica, Ivana Balazevic, Joan Puigcerver, Pinelopi Papalampidi, Olivier Henaff, Xi Xiong, Radu Soricut, Jeremiah Harmsen, Xiaohua Zhai
PaliGemma is an open Vision-Language Model (VLM) that is based on the SigLIP-So400m vision encoder and the Gemma-2B language model.
1 code implementation • 27 Jun 2024 • Lan Chen, Dong Li, Xiao Wang, Pengpeng Shao, Wei zhang, YaoWei Wang, Yonghong Tian, Jin Tang
In this paper, we propose a novel dual-stream framework for event stream-based pattern recognition via differentiated fusion, termed EFV++.
1 code implementation • 26 Jun 2024 • Caishuang Huang, Wanxu Zhao, Rui Zheng, Huijie Lv, WenYu Zhan, Shihan Dou, Sixian Li, Xiao Wang, Enyu Zhou, Junjie Ye, Yuming Yang, Tao Gui, Qi Zhang, Xuanjing Huang
As the development of large language models (LLMs) rapidly advances, securing these models effectively without compromising their utility has become a pivotal area of research.
no code implementations • 21 Jun 2024 • Jing Yang, Yu Zhao, Linyao Yang, Xiao Wang, Long Chen, Fei-Yue Wang
Temporal relation extraction (TRE) aims to grasp the evolution of events or actions, and thus shape the workflow of associated tasks, so it holds promise in helping understand task requests initiated by requesters in crowdsourcing systems.
1 code implementation • 21 Jun 2024 • Zhiwei Fei, Songyang Zhang, Xiaoyu Shen, Dawei Zhu, Xiao Wang, Maosong Cao, Fengzhe Zhou, Yining Li, Wenwei Zhang, Dahua Lin, Kai Chen, Jidong Ge
While large language models (LLMs) have showcased impressive capabilities, they struggle with addressing legal queries due to the intricate complexities and specialized expertise required in the legal field.
1 code implementation • 19 Jun 2024 • Yudi Ruan, Hao Ma, Weikai Li, Xiao Wang
Low-light image enhancement (LLIE) is critical in computer vision.
1 code implementation • 17 Jun 2024 • Yuming Yang, Wantong Zhao, Caishuang Huang, Junjie Ye, Xiao Wang, Huiyuan Zheng, Yang Nan, Yuran Wang, Xueying Xu, Kaixin Huang, Yunke Zhang, Tao Gui, Qi Zhang, Xuanjing Huang
First, we detect inconsistent entity definitions across datasets and clarify them by distinguishable label names to construct a universal taxonomy of 400+ entity types.
1 code implementation • 17 Jun 2024 • Rong Bao, Rui Zheng, Shihan Dou, Xiao Wang, Enyu Zhou, Bo wang, Qi Zhang, Liang Ding, DaCheng Tao
In aligning large language models (LLMs), utilizing feedback from existing advanced AI rather than humans is an important method to scale supervisory signals.
no code implementations • 15 Jun 2024 • Yi Feng, Ping Li, Ioannis Panageas, Xiao Wang
Last-iterate behaviors of learning algorithms in repeated two-player zero-sum games have been extensively studied due to their wide applications in machine learning and related tasks.
no code implementations • 12 Jun 2024 • Jungeum Kim, Xiao Wang
Nonlinear dimensional reduction with the manifold assumption, often called manifold learning, has proven its usefulness in a wide range of high-dimensional data analysis.
1 code implementation • 10 Jun 2024 • Peng Xia, Ze Chen, Juanxi Tian, Yangrui Gong, Ruibo Hou, Yue Xu, Zhenbang Wu, Zhiyuan Fan, Yiyang Zhou, Kangyu Zhu, Wenhao Zheng, Zhaoyang Wang, Xiao Wang, Xuchao Zhang, Chetan Bansal, Marc Niethammer, Junzhou Huang, Hongtu Zhu, Yun Li, Jimeng Sun, ZongYuan Ge, Gang Li, James Zou, Huaxiu Yao
Artificial intelligence has significantly impacted medical applications, particularly with the advent of Medical Large Vision Language Models (Med-LVLMs), sparking optimism for the future of automated and personalized healthcare.
no code implementations • 5 Jun 2024 • Esma Mouine, Yan Liu, Lu Xiao, Rick Kazman, Xiao Wang
A fundamental but unresolved research question is: how do different factors in the mining and learning process impact the accuracy of identifying vulnerabilities in software projects of varying characteristics?
no code implementations • 4 Jun 2024 • Jing Yang, Xiao Wang, Yu Zhao, Yuhang Liu, Fei-Yue Wang
Therefore, we present a Prompt-Based Contrastive learning framework for TD (PBCT), which incorporates a prompt-based trigger detector to overcome dependence.
1 code implementation • 22 May 2024 • Angéline Pouget, Lucas Beyer, Emanuele Bugliarello, Xiao Wang, Andreas Peter Steiner, Xiaohua Zhai, Ibrahim Alabdulmohsin
Third, we introduce the task of geo-localization as a novel evaluation metric to assess cultural diversity in VLMs.
no code implementations • 16 May 2024 • Jing Yang, Xiao Wang, Yutong Wang, Jiawei Wang, Fei-Yue Wang
To achieve more accurate TKG reasoning, we propose an attention masking-based contrastive event network (AMCEN) with local-global temporal patterns for the two-stage prediction of future events.
no code implementations • 13 May 2024 • Lu Dong, Lipisha Chaudhary, Fei Xu, Xiao Wang, Mason Lary, Ifeoma Nwogu
Achieving expressive 3D motion reconstruction and automatic generation for isolated sign words can be challenging, due to the lack of real-world 3D sign-word data, the complex nuances of signing motions, and the cross-modal understanding of sign language semantics.
1 code implementation • 2 May 2024 • Yujie Xing, Xiao Wang, Yibo Li, Hai Huang, Chuan Shi
Then we propose a novel Bi-Level Global Graph Transformer with Collaborative Training (CoBFormer), including the inter-cluster and intra-cluster Transformers, to prevent the over-globalizing problem while keeping the ability to extract valuable information from distant nodes.
2 code implementations • 28 Apr 2024 • Ju Huang, Shiao Wang, Shuai Wang, Zhe Wu, Xiao Wang, Bo Jiang
Specifically, our Mamba-based tracker achieves 43.5/55.6 on the SR/PR metric, while the ViT-S based tracker (OSTrack) obtains 40.0/50.9.
1 code implementation • 27 Apr 2024 • Xiao Wang, Yuehang Li, Wentao Wu, Jiandong Jin, Yao Rong, Bo Jiang, Chuanfu Li, Jin Tang
Existing X-ray based pre-trained vision models are usually conducted on a relatively small-scale dataset (less than 500k samples) with limited resolution (e.g., 224 $\times$ 224).
3 code implementations • 27 Apr 2024 • Xiao Wang, Qian Zhu, Jiandong Jin, Jun Zhu, Futian Wang, Bo Jiang, YaoWei Wang, Yonghong Tian
Specifically, we formulate the video-based PAR as a vision-language fusion problem and adopt a pre-trained foundation model CLIP to extract the visual features.
1 code implementation • 27 Apr 2024 • Yijia Liu, Xiao Wang
The results of our experiments highlight the superiority of our proposed framework over existing methods, such as sparse variational Bayesian and generative models, in terms of prediction accuracy and uncertainty quantification.
no code implementations • 23 Apr 2024 • Xiao Wang, Siyan Liu, Aristeidis Tsaris, Jong-Youl Choi, Ashwin Aji, Ming Fan, Wei Zhang, Junqi Yin, Moetasim Ashfaq, Dan Lu, Prasanna Balaprakash
As the largest model of its kind, ORBIT surpasses the current climate AI foundation model size by a thousandfold.
no code implementations • 18 Apr 2024 • Xiao Wang, Ke Tang, Xingyuan Dai, Jintao Xu, Quancheng Du, Rui Ai, Yuxiao Wang, Weihao Gu
To effectively assess the risks prevailing in the vicinity of AVs in social interactive traffic scenarios and achieve safe autonomous driving, this article proposes a social-suitable and safety-sensitive trajectory planning (S4TP) framework.
no code implementations • 16 Apr 2024 • Xiao Wang, Tianze Chen, Xianjun Yang, Qi Zhang, Xun Zhao, Dahua Lin
The open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
1 code implementation • 15 Apr 2024 • Xiao Wang, Shiao Wang, Yuhe Ding, Yuehang Li, Wentao Wu, Yao Rong, Weizhe Kong, Ju Huang, Shihao Li, Haoxiang Yang, Ziwen Wang, Bo Jiang, Chenglong Li, YaoWei Wang, Yonghong Tian, Jin Tang
In this paper, we give the first comprehensive review of these works and also provide experimental comparisons and analysis to better demonstrate the features and advantages of SSM.
no code implementations • 15 Apr 2024 • Enzhi Zhang, Isaac Lyngaas, Peng Chen, Xiao Wang, Jun Igarashi, Yuankai Huo, Mohamed Wahib, Masaharu Munetomo
For high-resolution images, e.g., microscopic pathology images, the quadratic compute and memory cost prohibits the use of an attention-based model if we are to use smaller patch sizes that are favorable in segmentation.
1 code implementation • 28 Mar 2024 • Bo Wan, Michael Tschannen, Yongqin Xian, Filip Pavetic, Ibrahim Alabdulmohsin, Xiao Wang, André Susano Pinto, Andreas Steiner, Lucas Beyer, Xiaohua Zhai
In this paper, we propose a simple visual pretraining method with location-aware captioners (LocCa).
no code implementations • 21 Mar 2024 • Yulan Hu, Sheng Ouyang, Zhirui Yang, Ge Chen, Junchen Wan, Xiao Wang, Yong Liu
Specifically, GA^2E proposes to use the subgraph as the meta-structure, which remains consistent across all graph tasks (ranging from node-, edge-, and graph-level to transfer learning) and all stages (both during training and inference).
1 code implementation • 18 Mar 2024 • Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang
This paper introduces EasyJailbreak, a unified framework simplifying the construction and evaluation of jailbreak attacks against LLMs.
1 code implementation • 10 Mar 2024 • Lin Zhu, Xianzhang Chen, Xiao Wang, Hua Huang
Our framework exhibits a substantial margin of improvement in capturing and highlighting visual saliency in the spike stream, which not only provides a new perspective for spike-based saliency segmentation but also shows a new paradigm for full SNN-based transformer models.
4 code implementations • 9 Mar 2024 • Xiao Wang, Ju Huang, Shiao Wang, Chuanming Tang, Bo Jiang, Yonghong Tian, Jin Tang, Bin Luo
Current event-/frame-event based trackers are evaluated on short-term tracking datasets; however, real-world scenarios involve long-term tracking, and the performance of existing tracking algorithms in these scenarios remains unclear.
no code implementations • 7 Mar 2024 • Ibrahim Alabdulmohsin, Xiao Wang, Andreas Steiner, Priya Goyal, Alexander D'Amour, Xiaohua Zhai
Interestingly, data and architectural improvements seem to mitigate the negative impact of data balancing on performance; e.g., applying M4 to SigLIP-B/16 with data quality filters improves COCO image-to-text retrieval @5 from 86% (without data balancing) to 87% and ImageNet 0-shot classification from 77% to 77.5%!
no code implementations • 7 Mar 2024 • Yuling Wang, Changxin Tian, Binbin Hu, Yanhua Yu, Ziqi Liu, Zhiqiang Zhang, Jun Zhou, Liang Pang, Xiao Wang
We encode the generated rationales from the student model into a dense vector, which empowers recommendation in both ID-based and ID-agnostic scenarios.
no code implementations • 6 Mar 2024 • Yuling Wang, Xiao Wang, Xiangzhou Huang, Yanhua Yu, Haoyang Li, Mengdi Zhang, Zirui Guo, Wei Wu
The other is that different behaviors have different intent distributions, which raises the question of how to establish their relations for a more explainable recommender system.
1 code implementation • NeurIPS 2023 • Donglin Xia, Xiao Wang, Nian Liu, Chuan Shi
To address this challenge, we propose the Cluster Information Transfer (CIT) mechanism (Code available at https://github.com/BUPT-GAMMA/CITGNN), which can learn invariant representations for GNNs, thereby improving their generalization ability to various and unknown test graphs with structure shift.
no code implementations • 5 Mar 2024 • Yanbei Liu, Yu Zhao, Xiao Wang, Lei Geng, Zhitao Xiao
Through an experimental analysis, we discover that the semantic information of an augmented graph structure may not be consistent with that of the original graph structure, and that whether two augmented graphs form a positive or negative pair is highly related to the multi-scale structures.
no code implementations • 5 Mar 2024 • Mengmei Zhang, Xiao Wang, Chuan Shi, Lingjuan Lyu, Tianchi Yang, Junping Du
To break this dilemma, we propose a new type of topology attack, named minimum-budget topology attack, aiming to adaptively find the minimum perturbation sufficient for a successful attack on each node.
no code implementations • 1 Mar 2024 • Qiang Meng, Xiao Wang, Jiabao Wang, Liujiang Yan, Ke Wang
Our proposed Small, Versatile, and Mighty (SVM) network utilizes a pure convolutional architecture to fully unleash the efficiency and multi-tasking potentials of the range view representation.
1 code implementation • 26 Feb 2024 • Huijie Lv, Xiao Wang, Yuansen Zhang, Caishuang Huang, Shihan Dou, Junjie Ye, Tao Gui, Qi Zhang, Xuanjing Huang
Adversarial misuse, particularly through 'jailbreaking' that circumvents a model's safety and ethical protocols, poses a significant challenge for Large Language Models (LLMs).
no code implementations • 26 Feb 2024 • Yuansen Zhang, Xiao Wang, Zhiheng Xi, Han Xia, Tao Gui, Qi Zhang, Xuanjing Huang
In this paper, drawing inspiration from recent works that LLMs are sensitive to the design of the instructions, we utilize instructions in code style, which are more structural and less ambiguous, to replace typically natural language instructions.
1 code implementation • 8 Feb 2024 • Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, Wei He, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang
In this paper, we propose R$^3$: Learning Reasoning through Reverse Curriculum Reinforcement Learning (RL), a novel method that employs only outcome supervision to achieve the benefits of process supervision for large language models.
1 code implementation • 7 Feb 2024 • Hailiang Li, Yan Huo, Yan Wang, Xu Yang, Miaohui Hao, Xiao Wang
As the modern CPU, GPU, and NPU chip design complexity and transistor counts keep increasing, and with the relentless shrinking of semiconductor technology nodes to nearly 1 nanometer, the placement and routing have gradually become the two most pivotal processes in modern very-large-scale-integrated (VLSI) circuit back-end design.
1 code implementation • 3 Feb 2024 • Lixu Wang, Yang Zhao, Jiahua Dong, Ating Yin, Qinbin Li, Xiao Wang, Dusit Niyato, Qi Zhu
Federated Learning (FL) is a privacy-preserving distributed learning approach that is rapidly developing in an era where privacy protection is increasingly valued.
1 code implementation • 2 Feb 2024 • Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui
The advancement of large language models (LLMs) has significantly propelled the field of code generation.
no code implementations • 31 Jan 2024 • Chenyu Shi, Xiao Wang, Qiming Ge, Songyang Gao, Xianjun Yang, Tao Gui, Qi Zhang, Xuanjing Huang, Xun Zhao, Dahua Lin
Large language models are meticulously aligned to be both helpful and harmless.
no code implementations • 30 Jan 2024 • Yibo Li, Xiao Wang, Yujie Xing, Shaohua Fan, Ruijia Wang, Yaoqi Liu, Chuan Shi
Recently, there has been an increasing interest in ensuring fairness on GNNs, but all of them are under the assumption that the training and testing data are under the same distribution, i.e., training data and testing data are from the same graph.
no code implementations • 30 Jan 2024 • Linyao Yang, Hongyang Chen, Xiao Wang, Jing Yang, Fei-Yue Wang, Han Liu
The final prediction of the equivalent entity is derived from the LLM's output.
1 code implementation • 23 Jan 2024 • Yanhu Mo, Xiao Wang, Shaohua Fan, Chuan Shi
How can we fix it and encourage the current GCL to learn better invariant representations?
1 code implementation • 21 Jan 2024 • Songyang Gao, Qiming Ge, Wei Shen, Shihan Dou, Junjie Ye, Xiao Wang, Rui Zheng, Yicheng Zou, Zhi Chen, Hang Yan, Qi Zhang, Dahua Lin
This reliance limits the applicability of RLHF and hinders the development of professional assistants tailored to diverse human preferences.
1 code implementation • 20 Jan 2024 • Haoxiang Yang, Chengguo Yuan, Yabin Zhu, Lan Chen, Xiao Wang, Futian Wang
The mainstream human activity recognition (HAR) algorithms are developed based on RGB cameras, which are easily influenced by low-quality images (e.g., low illumination, motion blur).
1 code implementation • 11 Jan 2024 • Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset and fully leverage high-quality preference data.
2 code implementations • 10 Jan 2024 • Yue Huang, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, Huan Zhang, Huaxiu Yao, Manolis Kellis, Marinka Zitnik, Meng Jiang, Mohit Bansal, James Zou, Jian Pei, Jian Liu, Jianfeng Gao, Jiawei Han, Jieyu Zhao, Jiliang Tang, Jindong Wang, Joaquin Vanschoren, John Mitchell, Kai Shu, Kaidi Xu, Kai-Wei Chang, Lifang He, Lifu Huang, Michael Backes, Neil Zhenqiang Gong, Philip S. Yu, Pin-Yu Chen, Quanquan Gu, Ran Xu, Rex Ying, Shuiwang Ji, Suman Jana, Tianlong Chen, Tianming Liu, Tianyi Zhou, William Wang, Xiang Li, Xiangliang Zhang, Xiao Wang, Xing Xie, Xun Chen, Xuyu Wang, Yan Liu, Yanfang Ye, Yinzhi Cao, Yong Chen, Yue Zhao
This paper introduces TrustLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, established benchmark, evaluation, and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions.
1 code implementation • 5 Jan 2024 • Yabin Zhu, Xiao Wang, Chenglong Li, Bo Jiang, Lin Zhu, Zhixiang Huang, Yonghong Tian, Jin Tang
In this work, we formally propose the task of object tracking using unaligned neuromorphic and visible cameras.
no code implementations • CVPR 2024 • Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, AJ Piergiovanni, Matthias Minderer, Filip Pavetic, Austin Waters, Gang Li, Ibrahim Alabdulmohsin, Lucas Beyer, Julien Amelot, Kenton Lee, Andreas Peter Steiner, Yang Li, Daniel Keysers, Anurag Arnab, Yuanzhong Xu, Keran Rong, Alexander Kolesnikov, Mojtaba Seyedhosseini, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut
We explore the boundaries of scaling up a multilingual vision and language model both in terms of size of the components and the breadth of its training task mixture.
1 code implementation • 21 Dec 2023 • Yingzhou Lu, Minjie Shen, Ling Yue, Chenhao Li, Lulu Chen, Fan Meng, Xiao Wang, David Herrington, Yue Wang, Yue Zhao, Tianfan Fu, Capucine van Rechem
With GenoCraft, researchers and data scientists have access to an array of cutting-edge bioinformatics tools under a user-friendly interface, making it a valuable resource for managing and analyzing large-scale omics data.
no code implementations • 21 Dec 2023 • Lixu Wang, Chenxi Liu, Junfeng Guo, Jiahua Dong, Xiao Wang, Heng Huang, Qi Zhu
In a privacy-focused era, Federated Learning (FL) has emerged as a promising machine learning technique.
no code implementations • 20 Dec 2023 • Sajal Dash, Isaac Lyngaas, Junqi Yin, Xiao Wang, Romain Egele, Guojing Cong, Feiyi Wang, Prasanna Balaprakash
For the training of the 175 Billion parameter model and the 1 Trillion parameter model, we achieved $100\%$ weak scaling efficiency on 1024 and 3072 MI250X GPUs, respectively.
1 code implementation • 18 Dec 2023 • Xiao Wang, Yao Rong, Shiao Wang, Yuan Chen, Zhe Wu, Bo Jiang, Yonghong Tian, Jin Tang
It is intuitive to combine them for high-performance RGB-Event based video recognition; however, existing works fail to achieve a good balance between accuracy and model parameters.
2 code implementations • 17 Dec 2023 • Xiao Wang, Jiandong Jin, Chenglong Li, Jin Tang, Cheng Zhang, Wei Wang
In this paper, we formulate PAR as a vision-language fusion problem and fully exploit the relations between pedestrian images and attribute labels.
1 code implementation • 15 Dec 2023 • Xiao Wang, Wentao Wu, Chenglong Li, Zhicheng Zhao, Zhe Chen, Yukai Shi, Jin Tang
To address this issue, we propose a novel vehicle-centric pre-training framework called VehicleMAE, which incorporates the structural information including the spatial structure from vehicle profile information and the semantic structure from informative high-level natural language descriptions for effective masked vehicle appearance reconstruction.
1 code implementation • 15 Dec 2023 • Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Jun Zhao, Wei Shen, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Xiaoran Fan, ShiLiang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
Supervised fine-tuning (SFT) is a crucial step for large language models (LLMs), enabling them to align with human instructions and enhance their capabilities in downstream tasks.
no code implementations • 14 Dec 2023 • Yibo Li, Xiao Wang, Hongrui Liu, Chuan Shi
In this paper, we propose a general diffusion equation framework with the fidelity term, which formally establishes the relationship between the diffusion process and more GNNs.
2 code implementations • 4 Dec 2023 • Jiandong Jin, Xiao Wang, Chenglong Li, Lili Huang, Jin Tang
Then, a Transformer decoder is proposed to generate the human attributes by incorporating the visual features and attribute query tokens.
2 code implementations • 1 Dec 2023 • Xiao Wang, Yaoyu Li, Tian Gan, Zheng Zhang, Jingjing Lv, Liqiang Nie
Recent advancements in video-language understanding have been established on the foundation of image-text models, resulting in promising outcomes due to the shared knowledge between images and videos.
Ranked #9 on Video Captioning on MSR-VTT (using extra training data)
1 code implementation • 30 Nov 2023 • Dong Li, Jiandong Jin, Yuhao Zhang, Yanlin Zhong, Yaoyang Wu, Lan Chen, Xiao Wang, Bin Luo
Current methods typically employ backbone networks to individually extract the features of RGB frames and event streams, and subsequently fuse these features for pattern recognition.
no code implementations • 4 Nov 2023 • Xiao Wang, Isaac Lyngaas, Aristeidis Tsaris, Peng Chen, Sajal Dash, Mayanka Chandra Shekar, Tao Luo, Hong-Jun Yoon, Mohamed Wahib, John Gounley
This paper presents a novel and efficient distributed training method, the Long Short-Sequence Transformer (LSS Transformer), for training transformer with long sequences.
1 code implementation • 22 Oct 2023 • Xiao Wang, Tianze Chen, Qiming Ge, Han Xia, Rong Bao, Rui Zheng, Qi Zhang, Tao Gui, Xuanjing Huang
In this paper, we propose orthogonal low-rank adaptation (O-LoRA), a simple and efficient approach for continual learning in language models, effectively mitigating catastrophic forgetting while learning new tasks.
no code implementations • 18 Oct 2023 • Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang
In this work, we propose a novel approach that can learn a consistent policy via RL across various data groups or domains.
1 code implementation • 17 Oct 2023 • Bo Jiang, Zitian Wang, Xixi Wang, Ziyan Zhang, Lan Chen, Xiao Wang, Bin Luo
Then, each pixel of feature map is regarded as a graph node and the graph neural network is proposed to model the structured information for coarse change map prediction.
1 code implementation • 13 Oct 2023 • Xi Chen, Xiao Wang, Lucas Beyer, Alexander Kolesnikov, Jialin Wu, Paul Voigtlaender, Basil Mustafa, Sebastian Goodman, Ibrahim Alabdulmohsin, Piotr Padlewski, Daniel Salz, Xi Xiong, Daniel Vlasic, Filip Pavetic, Keran Rong, Tianli Yu, Daniel Keysers, Xiaohua Zhai, Radu Soricut
This paper presents PaLI-3, a smaller, faster, and stronger vision language model (VLM) that compares favorably to similar models that are 10x larger.
Ranked #2 on Temporal/Casual QA on NExT-QA (using extra training data)
1 code implementation • 10 Oct 2023 • Xiao Wang, Yuansen Zhang, Tianze Chen, Songyang Gao, Senjie Jin, Xianjun Yang, Zhiheng Xi, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xuanjing Huang
In this paper, we introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
no code implementations • 6 Oct 2023 • Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, Rao Kotamarthi, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Troy Arcomano, Romit Maulik, Maxim Zvyagin, Alexander Brace, Bin Zhang, Cindy Orozco Bohorquez, Austin Clyde, Bharat Kale, Danilo Perez-Rivera, Heng Ma, Carla M. Mann, Michael Irvin, J. Gregory Pauloski, Logan Ward, Valerie Hayot, Murali Emani, Zhen Xie, Diangen Lin, Maulik Shukla, Ian Foster, James J. Davis, Michael E. Papka, Thomas Brettin, Prasanna Balaprakash, Gina Tourassi, John Gounley, Heidi Hanson, Thomas E Potok, Massimiliano Lupo Pasini, Kate Evans, Dan Lu, Dalton Lunga, Junqi Yin, Sajal Dash, Feiyi Wang, Mallikarjun Shankar, Isaac Lyngaas, Xiao Wang, Guojing Cong, Pei Zhang, Ming Fan, Siyan Liu, Adolfy Hoisie, Shinjae Yoo, Yihui Ren, William Tang, Kyle Felker, Alexey Svyatkovskiy, Hang Liu, Ashwin Aji, Angela Dalton, Michael Schulte, Karl Schulz, Yuntian Deng, Weili Nie, Josh Romero, Christian Dallago, Arash Vahdat, Chaowei Xiao, Thomas Gibbs, Anima Anandkumar, Rick Stevens
In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences.
no code implementations • 4 Oct 2023 • Xianjun Yang, Xiao Wang, Qi Zhang, Linda Petzold, William Yang Wang, Xun Zhao, Dahua Lin
This study serves as a clarion call for a collective effort to overhaul and fortify the safety of open-source LLMs against malicious attackers.
4 code implementations • CVPR 2024 • Xiao Wang, Shiao Wang, Chuanming Tang, Lin Zhu, Bo Jiang, Yonghong Tian, Jin Tang
Tracking using bio-inspired event cameras has drawn more and more attention in recent years.
1 code implementation • NeurIPS 2023 • Yue Yu, Xiao Wang, Mengmei Zhang, Nian Liu, Chuan Shi
To this end, we propose the PrOvable Training (POT) for GCL, which regularizes the training of GCL to encode node embeddings that follow the GCL principle better.
1 code implementation • 14 Sep 2023 • Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, Yongyan Zheng, Xipeng Qiu, Xuanjing Huang, Tao Gui
Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks.
no code implementations • 31 Aug 2023 • Xiao Wang, Fang Dai, Wenyan Guo, Junfeng Wang
Therefore, a stochastic block model that integrates betweenness centrality and clustering coefficient of nodes for community detection in attributed networks, named BCSBM, is proposed in this paper.
no code implementations • 27 Aug 2023 • Xiujun Shu, Wei Wen, Liangsheng Xu, Ruizhi Qiao, Taian Guo, Hanjun Li, Bei Gan, Xiao Wang, Xing Sun
In this paper, we present a unified and dynamic graph (UniDG) framework for temporal character grouping.
1 code implementation • 23 Aug 2023 • Chengguo Yuan, Yu Jin, Zongzhen Wu, Fanting Wei, Yangzirui Wang, Lan Chen, Xiao Wang
Additionally, a bottleneck Transformer is introduced to facilitate the fusion of the dual-stream information.
1 code implementation • 14 Aug 2023 • Tian Gan, Xiao Wang, Yan Sun, Jianlong Wu, Qingpei Guo, Liqiang Nie
The goal of TSGSV is to evaluate the relevance between a video stream and a given sentence query.
1 code implementation • 10 Aug 2023 • Haoju Leng, Ruining Deng, Shunxing Bao, Dazheng Fang, Bryan A. Millis, Yucheng Tang, Haichun Yang, Xiao Wang, Yifan Peng, Lipeng Wan, Yuankai Huo
The performance evaluation encompasses two key scenarios: (1) a pure CPU-based image analysis scenario ("CPU scenario"), and (2) a GPU-based deep learning framework scenario ("GPU scenario").
1 code implementation • 8 Aug 2023 • Xiao Wang, Yao Rong, Zongzhen Wu, Lin Zhu, Bo Jiang, Jin Tang, Yonghong Tian
Secondly, they adopt either Spiking Neural Networks (SNN) for energy-efficient recognition with suboptimal results, or Artificial Neural Networks (ANN) for energy-intensive, high-performance recognition.
no code implementations • 1 Aug 2023 • Xiao Wang, Sean MacAvaney, Craig Macdonald, Iadh Ounis
GenQR directly reformulates the user's input query, while GenPRF provides additional context for the query by making use of pseudo-relevance feedback information.
1 code implementation • 27 Jun 2023 • Songyang Gao, Shihan Dou, Yan Liu, Xiao Wang, Qi Zhang, Zhongyu Wei, Jin Ma, Ying Shan
Adversarial training is one of the best-performing methods in improving the robustness of deep language models.
1 code implementation • 13 Jun 2023 • Yizhen Zheng, He Zhang, Vincent CS Lee, Yu Zheng, Xiao Wang, Shirui Pan
Real-world graphs generally have only one kind of tendency in their connections.
1 code implementation • 8 Jun 2023 • Bo Jiang, Chengguo Yuan, Xiao Wang, Zhimin Bao, Lin Zhu, Yonghong Tian, Jin Tang
To address these issues, we propose a novel dual point-voxel absorbing graph representation learning for event stream data representation.
no code implementations • 30 May 2023 • Bo Jiang, Shuxian Luo, Xiao Wang, Chuanfu Li, Jin Tang
Second, AMatFormer adopts a shared FFN module to further embed the features of two images into the common domain and thus learn the consensus feature representations for the matching problem.
2 code implementations • 29 May 2023 • Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, AJ Piergiovanni, Matthias Minderer, Filip Pavetic, Austin Waters, Gang Li, Ibrahim Alabdulmohsin, Lucas Beyer, Julien Amelot, Kenton Lee, Andreas Peter Steiner, Yang Li, Daniel Keysers, Anurag Arnab, Yuanzhong Xu, Keran Rong, Alexander Kolesnikov, Mojtaba Seyedhosseini, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut
We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture.
Ranked #1 on Fine-Grained Image Recognition on OVEN
1 code implementation • NeurIPS 2023 • Jannik Kossen, Mark Collier, Basil Mustafa, Xiao Wang, Xiaohua Zhai, Lucas Beyer, Andreas Steiner, Jesse Berent, Rodolphe Jenatton, Efi Kokiopoulou
With 3T, we propose a more flexible strategy that allows the image tower to benefit from both pretrained embeddings and contrastive training.
1 code implementation • 22 May 2023 • Xiao Wang, Weikang Zhou, Qi Zhang, Jie Zhou, Songyang Gao, Junzhe Wang, Menghan Zhang, Xiang Gao, Yunwen Chen, Tao Gui
Pretrained language models have achieved remarkable success in various natural language processing tasks.
1 code implementation • 21 May 2023 • Limao Xiong, Jie Zhou, Qunxi Zhu, Xiao Wang, Yuanbin Wu, Qi Zhang, Tao Gui, Xuanjing Huang, Jin Ma, Ying Shan
Particularly, we propose a Confidence-based Partial Label Learning (CPLL) method to integrate the prior confidence (given by annotators) and posterior confidences (learned by models) for crowd-annotated NER.
no code implementations • 8 May 2023 • Yupei Lin, Sen Zhang, Xiaojun Yang, Xiao Wang, Yukai Shi
To ensure consistent preservation of the shape during image editing, we propose cross-attention guidance based on regeneration learning.
no code implementations • 24 Apr 2023 • Nian Liu, Xiao Wang, Hui Han, Chuan Shi
Specifically, two views of a HIN (network schema and meta-path views) are proposed to learn node embeddings, so as to capture both local and high-order structures simultaneously.
1 code implementation • 20 Apr 2023 • Jun Zhu, Jiandong Jin, Zihan Yang, Xiaohao Wu, Xiao Wang
The averaged visual tokens and text tokens are concatenated and fed into a fusion Transformer for multi-modal interactive learning.
1 code implementation • 17 Apr 2023 • Xiao Wang, Weikang Zhou, Can Zu, Han Xia, Tianze Chen, Yuansen Zhang, Rui Zheng, Junjie Ye, Qi Zhang, Tao Gui, Jihua Kang, Jingsheng Yang, Siyuan Li, Chunsai Du
Large language models have unlocked strong multi-task capabilities from reading instructive prompts.
Ranked #3 on Zero-shot Named Entity Recognition (NER) on CrossNER (using extra training data)
1 code implementation • CVPR 2023 • Ziyue Zhu, Qiang Meng, Xiao Wang, Ke Wang, Liujiang Yan, Jian Yang
For the loss design, we propose the COMLoss to dynamically predict object-level difficulties and emphasize objects of different difficulties based on training stages.
1 code implementation • 8 Apr 2023 • Yixuan Qiu, Xiao Wang
Sampling from high-dimensional distributions is a fundamental problem in statistical research and practice.
1 code implementation • 30 Mar 2023 • Lucas Beyer, Bo Wan, Gagan Madan, Filip Pavetic, Andreas Steiner, Alexander Kolesnikov, André Susano Pinto, Emanuele Bugliarello, Xiao Wang, Qihang Yu, Liang-Chieh Chen, Xiaohua Zhai
A key finding is that a small decoder learned on top of a frozen pretrained encoder works surprisingly well.
no code implementations • 26 Mar 2023 • Yabin Zhu, Chenglong Li, Xiao Wang, Jin Tang, Zhixiang Huang
In addition, existing learning methods of RGBT trackers either fuse multimodal features into one for final classification, or exploit the relationship between unimodal branches and fused branch through a competitive learning strategy.
1 code implementation • 15 Mar 2023 • Xiao Wang, Tian Gan, Yinwei Wei, Jianlong Wu, Dai Meng, Liqiang Nie
Existing methods mostly focus on analyzing video content, neglecting users' social influence and tag relation.
1 code implementation • 14 Mar 2023 • Xiao Wang, Ying Wang, Ziwei Xuan, Guo-Jun Qi
A criterion in unsupervised pretraining is that the pretext task needs to be sufficiently hard to prevent the transformer encoder from learning trivial low-level features that do not generalize well to downstream tasks.
1 code implementation • 20 Feb 2023 • Xiao Wang, Guangyao Chen, Guangwu Qian, Pengcheng Gao, Xiao-Yong Wei, YaoWei Wang, Yonghong Tian, Wen Gao
We also give visualization and analysis of the model parameters and results on representative downstream tasks.
no code implementations • 11 Feb 2023 • Deyu Bo, Xiao Wang, Yang Liu, Yuan Fang, Yawen Li, Chuan Shi
Graph neural networks (GNNs) have attracted considerable attention from the research community.
1 code implementation • 10 Feb 2023 • Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby
The scaling of Transformers has driven breakthrough capabilities for language models.
Ranked #1 on Zero-Shot Transfer Image Classification on ObjectNet
no code implementations • 8 Feb 2023 • Yingzhou Lu, Minjie Shen, Huazheng Wang, Xiao Wang, Capucine van Rechem, Tianfan Fu, Wenqi Wei
In light of these challenges, the concept of synthetic data generation emerges as a promising alternative that allows for data sharing and utilization in ways that real-world data cannot facilitate.
2 code implementations • 25 Jan 2023 • Chenxi Liu, Lixu Wang, Lingjuan Lyu, Chen Sun, Xiao Wang, Qi Zhu
To overcome these limitations of DA and DG in handling the Unfamiliar Period during continual domain shift, we propose RaTP, a framework that focuses on improving models' target domain generalization (TDG) capability, while also achieving effective target domain adaptation (TDA) capability right after training on certain domains and forgetting alleviation (FA) capability on past domains.
1 code implementation • 30 Nov 2022 • Shaohua Fan, Shuyang Zhang, Xiao Wang, Chuan Shi
In a dynamic graph, we propose to simultaneously estimate contemporaneous relationships and time-lagged interaction relationships between the node features.
no code implementations • 23 Nov 2022 • Adam Dziedzic, Christopher A Choquette-Choo, Natalie Dullerud, Vinith Menon Suriyakumar, Ali Shahin Shamsabadi, Muhammad Ahmad Kaleem, Somesh Jha, Nicolas Papernot, Xiao Wang
We use our mechanisms to enable privacy-preserving multi-label learning in the central setting by extending the canonical single-label technique: PATE.
2 code implementations • 20 Nov 2022 • Chuanming Tang, Xiao Wang, Ju Huang, Bo Jiang, Lin Zhu, Jianlin Zhang, YaoWei Wang, Yonghong Tian
In this paper, we propose a single-stage backbone network for Color-Event Unified Tracking (CEUTrack), which achieves the above functions simultaneously.
Ranked #3 on Object Tracking on COESOT
no code implementations • 19 Nov 2022 • Xixi Wang, Bo Jiang, Xiao Wang, Bin Luo
(1) It employs a flexible graph model, termed Batch Graph, to jointly encode the visual and semantic relationships of samples within each mini-batch.
3 code implementations • 17 Nov 2022 • Xiao Wang, Zongzhen Wu, Bo Jiang, Zhimin Bao, Lin Zhu, Guoqi Li, YaoWei Wang, Yonghong Tian
Mainstream human activity recognition (HAR) algorithms are developed for RGB cameras, which suffer from illumination changes, fast motion, privacy concerns, and high energy consumption.
no code implementations • 19 Oct 2022 • Niklas Kochdumper, Hanna Krasowski, Xiao Wang, Stanley Bak, Matthias Althoff
While reinforcement learning produces very promising results for many applications, its main disadvantage is the lack of safety guarantees, which prevents its use in safety-critical systems.
1 code implementation • 6 Oct 2022 • Ruijia Wang, Xiao Wang, Chuan Shi, Le Song
Recent studies show that graph convolutional networks (GCNs) often perform worse for low-degree nodes, exhibiting so-called structural unfairness on graphs with the long-tailed degree distributions prevalent in the real world.
1 code implementation • 5 Oct 2022 • Nian Liu, Xiao Wang, Deyu Bo, Chuan Shi, Jian Pei
We then theoretically prove, via the contrastive invariance theorem, that GCL is able to learn invariance information. Together with our GAME rule, we uncover for the first time that the representations learned by GCL essentially encode low-frequency information, which explains why GCL works.
1 code implementation • 28 Sep 2022 • Shaohua Fan, Xiao Wang, Yanhu Mo, Chuan Shi, Jian Tang
However, a graph classification investigation on training graphs with severe bias surprisingly reveals that GNNs tend to explore spurious correlations to make decisions, even when the causal correlation always exists.
1 code implementation • 14 Sep 2022 • Xi Chen, Xiao Wang, Soravit Changpinyo, AJ Piergiovanni, Piotr Padlewski, Daniel Salz, Sebastian Goodman, Adam Grycner, Basil Mustafa, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Nan Ding, Keran Rong, Hassan Akbari, Gaurav Mishra, Linting Xue, Ashish Thapliyal, James Bradbury, Weicheng Kuo, Mojtaba Seyedhosseini, Chao Jia, Burcu Karagol Ayan, Carlos Riquelme, Andreas Steiner, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut
PaLI generates text based on visual and textual inputs, and with this interface performs many vision, language, and multimodal tasks, in many languages.
no code implementations • 26 Aug 2022 • Xixi Wang, Xiao Wang, Bo Jiang, Bin Luo
sampleFormer aims to capture the dependence of samples in support and query sets for image representation.
1 code implementation • 18 Aug 2022 • Xiujun Shu, Wei Wen, Haoqian Wu, Keyu Chen, Yiran Song, Ruizhi Qiao, Bo Ren, Xiao Wang
To explore the fine-grained alignment, we further propose two implicit semantic alignment paradigms: multi-level alignment (MLA) and bidirectional mask modeling (BMM).