1 code implementation • 15 Apr 2025 • Xiang Wang, Shiwei Zhang, Longxiang Tang, Yingya Zhang, Changxin Gao, Yuehuan Wang, Nong Sang
Furthermore, we adopt a simple concatenation operation to integrate the reference appearance into the model and incorporate the pose information of the reference image for enhanced pose alignment.
no code implementations • 15 Apr 2025 • Minghui Lin, Shu Wang, Xiang Wang, Jianhua Tang, Longbin Fu, Zhengrong Zuo, Nong Sang
Current multi-modal object re-identification approaches based on large-scale pre-trained backbones (i.e., ViT) have displayed remarkable progress and achieved excellent performance.
no code implementations • 15 Apr 2025 • Xiang Wang, Shiwei Zhang, Hangjie Yuan, Yujie Wei, Yingya Zhang, Changxin Gao, Yuehuan Wang, Nong Sang
Recent advancements in human image animation have been propelled by video diffusion models, yet their reliance on numerous iterative denoising steps results in high inference costs and slow speeds.
no code implementations • 9 Apr 2025 • Junfeng Fang, Yukai Wang, Ruipeng Wang, Zijun Yao, Kun Wang, An Zhang, Xiang Wang, Tat-Seng Chua
The rapid advancement of multi-modal large reasoning models (MLRMs) -- enhanced versions of multimodal language models (MLLMs) equipped with reasoning capabilities -- has revolutionized diverse applications.
1 code implementation • 27 Mar 2025 • Minghui Lin, Xiang Wang, Yishan Wang, Shu Wang, Fengqi Dai, Pengxiang Ding, Cunxiang Wang, Zhengrong Zuo, Nong Sang, Siteng Huang, Donglin Wang
Recent advancements in video generation have witnessed significant progress, especially with the rapid advancement of diffusion models.
no code implementations • 19 Mar 2025 • Yanchen Luo, Zhiyuan Liu, Yi Zhao, Sihang Li, Kenji Kawaguchi, Tat-Seng Chua, Xiang Wang
In this work, we propose Unified Variational Auto-Encoder for 3D Molecular Latent Diffusion Modeling (UAE-3D), a multi-modal VAE that compresses 3D molecules into latent sequences from a unified latent space, while maintaining near-zero reconstruction error.
1 code implementation • 12 Mar 2025 • Yaorui Shi, Jiaqi Yang, Sihang Li, Junfeng Fang, Xiang Wang, Zhiyuan Liu, Yang Zhang
Pre-trained language models (PLMs) have revolutionized scientific research, yet their application to single-cell analysis remains limited.
1 code implementation • 11 Mar 2025 • Wei Shi, Sihang Li, Tao Liang, Mingyang Wan, Guojun Ma, Xiang Wang, Xiangnan He
In this paper, we introduce Route Sparse Autoencoder (RouteSAE), a new framework that integrates a routing mechanism with a shared SAE to efficiently extract features from multiple layers.
no code implementations • 10 Mar 2025 • Yujie Wei, Shiwei Zhang, Hangjie Yuan, Biao Gong, Longxiang Tang, Xiang Wang, Haonan Qiu, Hengjia Li, Shuai Tan, Yingya Zhang, Hongming Shan
First, in Relational Decoupling Learning, we disentangle relations from subject appearances using relation LoRA triplet and hybrid mask training strategy, ensuring better generalization across diverse relationships.
1 code implementation • 10 Mar 2025 • Junkang Wu, Kexin Huang, Xue Wang, Jinyang Gao, Bolin Ding, Jiancan Wu, Xiangnan He, Xiang Wang
Aligning large language models (LLMs) with human preferences is critical for real-world deployment, yet existing methods like RLHF face computational and stability challenges.
no code implementations • 6 Mar 2025 • Chengpeng Li, Mingfeng Xue, Zhenru Zhang, Jiaxi Yang, Beichen Zhang, Xiang Wang, Bowen Yu, Binyuan Hui, Junyang Lin, Dayiheng Liu
In this paper, we introduce START (Self-Taught Reasoner with Tools), a novel tool-integrated long CoT reasoning LLM that significantly enhances reasoning capabilities by leveraging external tools.
1 code implementation • 18 Feb 2025 • Zhiyuan Liu, Yanchen Luo, Han Huang, Enzhi Zhang, Sihang Li, Junfeng Fang, Yaorui Shi, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua
To combine these advantages for 3D molecule generation, we propose a foundation model -- NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation.
1 code implementation • 9 Feb 2025 • Zherui Li, Houcheng Jiang, Hao Chen, Baolong Bi, Zhenhong Zhou, Fei Sun, Junfeng Fang, Xiang Wang
Large language models (LLMs) acquire information from pre-training corpora, but their stored knowledge can become inaccurate or outdated over time.
no code implementations • 8 Feb 2025 • Houcheng Jiang, Junfeng Fang, Ningyu Zhang, Guojun Ma, Mingyang Wan, Xiang Wang, Xiangnan He, Tat-Seng Chua
Additionally, AnyEdit serves as a plug-and-play framework, enabling current editing methods to update knowledge with arbitrary length and format, significantly advancing the scope and practicality of LLM knowledge editing.
1 code implementation • 6 Feb 2025 • Guibin Zhang, Luyang Niu, Junfeng Fang, Kun Wang, Lei Bai, Xiang Wang
Large Language Model (LLM)-empowered multi-agent systems extend the cognitive boundaries of individual agents through disciplined collaboration and interaction, while constructing these systems often requires labor-intensive manual designs.
no code implementations • 4 Feb 2025 • Jinda Lu, Junkang Wu, Jinghan Li, Xiaojun Jia, Shuo Wang, Yifan Zhang, Junfeng Fang, Xiang Wang, Xiangnan He
Direct Preference Optimization (DPO) has shown effectiveness in aligning multi-modal large language models (MLLM) with human preferences.
no code implementations • 1 Feb 2025 • Chenlu Ding, Jiancan Wu, Yancheng Yuan, Junfeng Fang, Cunchun Li, Xiang Wang, Xiangnan He
In the realm of online digital advertising, conversion rate (CVR) prediction plays a pivotal role in maximizing revenue under cost-per-conversion (CPA) models, where advertisers are charged only when users complete specific actions, such as making a purchase.
no code implementations • 25 Jan 2025 • Jiayi Liao, Ruobing Xie, Sihang Li, Xiang Wang, Xingwu Sun, Zhanhui Kang, Xiangnan He
The framework consists of two stages: (1) Patch Pre-training, which familiarizes LLMs with item-level compression patterns, and (2) Patch Fine-tuning, which teaches LLMs to model sequences at multiple granularities.
1 code implementation • 22 Jan 2025 • Yongduo Sui, Jie Sun, Shuyao Wang, Zemin Liu, Qing Cui, Longfei Li, Xiang Wang
It provides a unified perspective on invariant graph learning, emphasizing both structural and semantic invariance principles to identify more robust stable features.
no code implementations • 15 Jan 2025 • YuAn Wang, Bin Zhu, Yanbin Hao, Chong-Wah Ngo, Yi Tan, Xiang Wang
These prompts encompass text prompts (representing cooking steps), image prompts (corresponding to cooking images), and multi-modal prompts (mixing cooking steps and images), ensuring the consistent generation of cooking procedural images.
no code implementations • 12 Jan 2025 • Zheng Zhang, Yihuai Lan, Yangsen Chen, Lei Wang, Xiang Wang, Hao Wang
This control not only ensures that NPCs can adapt to varying difficulty levels during gameplay, but also provides insights into the safety and fairness of LLM agents.
no code implementations • 25 Dec 2024 • Jiajia Chen, Jiancan Wu, Jiawei Chen, Chongming Gao, Yong Li, Xiang Wang
Collaborative recommendation fundamentally involves learning high-quality user and item representations from interaction data.
no code implementations • 12 Dec 2024 • Haonan Qiu, Shiwei Zhang, Yujie Wei, Ruihang Chu, Hangjie Yuan, Xiang Wang, Yingya Zhang, Ziwei Liu
Visual diffusion models achieve remarkable progress, yet they are typically trained at limited resolutions due to the lack of high-resolution data and constrained computation resources, hampering their ability to generate high-fidelity images or videos at higher resolutions.
no code implementations • 10 Dec 2024 • Yaorui Shi, Sihang Li, Taiyan Zhang, Xi Fang, Jiankun Wang, Zhiyuan Liu, Guojiang Zhao, Zhengdan Zhu, Zhifeng Gao, Renxin Zhong, Linfeng Zhang, Guolin Ke, Weinan E, Hengxing Cai, Xiang Wang
Automated drug discovery offers significant potential for accelerating the development of novel therapeutics by substituting labor-intensive human workflows with machine-driven processes.
1 code implementation • 9 Dec 2024 • YuAn Wang, Ouxiang Li, Tingting Mu, Yanbin Hao, Kuien Liu, Xiang Wang, Xiangnan He
Recent success of text-to-image (T2I) generation and its increasing practical applications, enabled by diffusion models, require urgent consideration of erasing unwanted concepts, e.g., copyrighted, offensive, and unsafe ones, from the pre-trained models in a precise, timely, and low-cost manner.
1 code implementation • 9 Dec 2024 • Huaxin Zhang, Xiaohao Xu, Xiang Wang, Jialong Zuo, Xiaonan Huang, Changxin Gao, Shanjun Zhang, Li Yu, Nong Sang
For efficient anomaly detection in long videos, we propose the Anomaly-focused Temporal Sampler (ATS).
no code implementations • 30 Nov 2024 • Chenlu Ding, Jiancan Wu, Yancheng Yuan, Jinda Lu, Kai Zhang, Alex Su, Xiang Wang, Xiangnan He
The advent of Large Language Models (LLMs) has revolutionized natural language processing, enabling advanced understanding and reasoning capabilities across a variety of tasks.
no code implementations • 26 Nov 2024 • Hengjia Li, Haonan Qiu, Shiwei Zhang, Xiang Wang, Yujie Wei, Zekun Li, Yingya Zhang, Boxi Wu, Deng Cai
The key challenge lies in maintaining high ID fidelity consistently while preserving the original motion dynamic and semantic following after the identity injection.
no code implementations • 22 Nov 2024 • Liangrui Pan, Qingchun Liang, Wenwu Zeng, Yijun Peng, Zhenyu Zhao, Yiyi Liang, Jiadi Luo, Xiang Wang, Shaoliang Peng
Spread through air spaces (STAS) is a distinct invasion pattern in lung cancer, crucial for prognosis assessment and guiding surgical decisions.
no code implementations • 17 Oct 2024 • Yujie Wei, Shiwei Zhang, Hangjie Yuan, Xiang Wang, Haonan Qiu, Rui Zhao, Yutong Feng, Feng Liu, Zhizhong Huang, Jiaxin Ye, Yingya Zhang, Hongming Shan
In this paper, we present DreamVideo-2, a zero-shot video customization framework capable of generating videos with a specific subject and motion trajectory, guided by a single image and a bounding box sequence, respectively, and without the need for test-time fine-tuning.
no code implementations • 17 Oct 2024 • Guoqing Hu, Zhengyi Yang, Zhibo Cai, An Zhang, Xiang Wang
Recent advancements in generative recommendation systems, particularly in the realm of sequential recommendation tasks, have shown promise in enhancing generalization to new items.
no code implementations • 16 Oct 2024 • Jiayi Liao, Xiangnan He, Ruobing Xie, Jiancan Wu, Yancheng Yuan, Xingwu Sun, Zhanhui Kang, Xiang Wang
Recently, there has been a growing interest in leveraging Large Language Models (LLMs) for recommendation systems, which usually adapt a pre-trained LLM to the recommendation scenario through supervised fine-tuning (SFT).
no code implementations • 14 Oct 2024 • Shuai Tan, Biao Gong, Xiang Wang, Shiwei Zhang, Dandan Zheng, Ruobing Zheng, Kecheng Zheng, Jingdong Chen, Ming Yang
Our in-depth analysis attributes this limitation to their insufficient modeling of motion, which fails to comprehend the movement pattern of the driving video and thus imposes a pose sequence rigidly onto the target character.
1 code implementation • 14 Oct 2024 • Junkang Wu, Xue Wang, Zhengyi Yang, Jiancan Wu, Jinyang Gao, Bolin Ding, Xiang Wang, Xiangnan He
Aligning large language models (LLMs) with human values and intentions is crucial for their utility, honesty, and safety.
1 code implementation • 9 Oct 2024 • Jinghan Li, Yuan Gao, Jinda Lu, Junfeng Fang, Congcong Wen, Hui Lin, Xiang Wang
Graph Anomaly Detection (GAD) is crucial for identifying abnormal entities within networks, garnering significant attention across various fields.
1 code implementation • 9 Oct 2024 • Rui Zhao, Hangjie Yuan, Yujie Wei, Shiwei Zhang, YuChao Gu, Lingmin Ran, Xiang Wang, Zhangjie Wu, Junhao Zhang, Yingya Zhang, Mike Zheng Shou
Our experiments with extensive data indicate that the model trained on generated data of the advanced model can approximate its generation capability.
2 code implementations • 5 Oct 2024 • Houcheng Jiang, Junfeng Fang, Tianyu Zhang, An Zhang, Ruipeng Wang, Tao Liang, Xiang Wang
This work explores sequential model editing in large language models (LLMs), a critical task that involves modifying internal knowledge within LLMs continuously through multi-round editing, each incorporating updates or corrections to adjust the model outputs without the need for costly retraining.
no code implementations • 4 Oct 2024 • Yanchen Luo, Junfeng Fang, Sihang Li, Zhiyuan Liu, Jiancan Wu, An Zhang, Wenjie Du, Xiang Wang
The de novo generation of molecules with targeted properties is crucial in biology, chemistry, and drug discovery.
2 code implementations • 3 Oct 2024 • Junfeng Fang, Houcheng Jiang, Kun Wang, Yunshan Ma, Xiang Wang, Xiangnan He, Tat-Seng Chua
To address this, we introduce AlphaEdit, a novel solution that projects perturbation onto the null space of the preserved knowledge before applying it to the parameters.
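The null-space idea in this snippet can be illustrated with a minimal sketch. This is not the AlphaEdit implementation; it only shows the general linear-algebra step of removing, from each row of a perturbation, any component that lies in the span of a set of preserved key vectors, so that the projected perturbation maps every preserved key to zero. The names `orthonormal_basis`, `project_to_null_space`, `delta`, and `preserved_keys` are hypothetical and chosen for illustration.

```python
import math


def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            coeff = sum(a * b for a, b in zip(w, q))
            w = [a - coeff * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > tol:  # skip vectors that are (numerically) linearly dependent
            basis.append([a / norm for a in w])
    return basis


def project_to_null_space(delta, preserved_keys):
    """Project each row of `delta` onto the orthogonal complement of the
    span of `preserved_keys` (hypothetical stand-in for preserved knowledge).

    After projection, the dot product of every row with every preserved
    key is zero up to floating-point error, so adding the projected
    perturbation to a weight matrix leaves its outputs on those keys unchanged.
    """
    basis = orthonormal_basis(preserved_keys)
    projected = []
    for row in delta:
        r = list(row)
        for q in basis:
            coeff = sum(a * b for a, b in zip(r, q))
            r = [a - coeff * b for a, b in zip(r, q)]
        projected.append(r)
    return projected


# Illustrative data only: two preserved keys in a 3-dimensional input space.
keys = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
delta = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
safe_delta = project_to_null_space(delta, keys)
```

In this toy setup the preserved keys span the first two coordinates, so only the third coordinate of each perturbation row survives the projection.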
no code implementations • 30 Sep 2024 • Xiang Wang, Changxin Gao, Yuehuan Wang, Nong Sang
Recent advancements in controllable human-centric video generation, particularly with the rise of diffusion models, have demonstrated considerable progress.
1 code implementation • 28 Aug 2024 • Sihang Li, Jin Huang, Jiaxi Zhuang, Yaorui Shi, Xiaochen Cai, Mingjun Xu, Xiang Wang, Linfeng Zhang, Guolin Ke, Hengxing Cai
To develop an LLM specialized in scientific literature understanding, we propose a hybrid strategy that integrates continual pre-training (CPT) and supervised fine-tuning (SFT), to simultaneously infuse scientific domain knowledge and enhance instruction-following capabilities for domain-specific tasks. In this process, we identify two key challenges: (1) constructing high-quality CPT corpora, and (2) generating diverse SFT instructions.
1 code implementation • 19 Aug 2024 • Xiaoyu Kong, Jiancan Wu, An Zhang, Leheng Sheng, Hui Lin, Xiang Wang, Xiangnan He
Sequential recommendation systems predict the next interaction item based on users' past interactions, aligning recommendations with individual preferences.
1 code implementation • 3 Aug 2024 • Wenyu Mao, Jiancan Wu, Haoyang Liu, Yongduo Sui, Xiang Wang
In this work, we propose a novel framework, called Invariant Graph Learning based on Information bottleneck theory (InfoIGL), to extract the invariant features of graphs and enhance models' generalization ability to unseen distributions.
no code implementations • 29 Jul 2024 • Chen-Lu Ding, Jiancan Wu, Wei Lin, Shiyang Shen, Xiang Wang, Yancheng Yuan
ASRC obtains the final clustering results by applying RCC to the learned feature representations with their consistent graph structure and edge weights.
1 code implementation • 24 Jul 2024 • Wenyu Mao, Jiancan Wu, Weijian Chen, Chongming Gao, Xiang Wang, Xiangnan He
In this work, we introduce the concept of instance-wise prompting, aiming at personalizing discrete prompts for individual users.
1 code implementation • 19 Jul 2024 • Jinda Lu, Shuo Wang, Yanbin Hao, Haifeng Liu, Xiang Wang, Meng Wang
However, these adaptation methods usually operate on the global view of an input image and thus have a biased perception of partial local details of the image.
1 code implementation • 12 Jul 2024 • Zekai Xu, Kang You, Qinghai Guo, Xiang Wang, Zhezhi He
Spiking neural networks (SNNs), which mimic the biological neural system by conveying information via discrete spikes, are well known as brain-inspired models with excellent computing efficiency.
1 code implementation • 11 Jul 2024 • Junkang Wu, Yuexiang Xie, Zhengyi Yang, Jiancan Wu, Jinyang Gao, Bolin Ding, Xiang Wang, Xiangnan He
Direct Preference Optimization (DPO) has emerged as a compelling approach for training Large Language Models (LLMs) to adhere to human preferences.
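For readers unfamiliar with DPO, the standard per-pair objective can be sketched as follows. This shows the commonly cited form of the loss, not the specific variant studied in the paper listed here; the function name `dpo_loss`, its argument names, and the default `beta` are assumptions made for illustration. Inputs are sequence log-probabilities of the chosen and rejected responses under the policy and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    loss = -log(sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l))))
    where w is the chosen response and l is the rejected one.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable evaluation of -log(sigmoid(margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Illustrative values only: the policy favors the chosen response more
# strongly than the reference does, so the loss falls below log(2).
example = dpo_loss(-1.0, -3.0, -2.0, -2.0, beta=0.1)
```

When the policy and reference agree exactly, the margin is zero and the loss equals log 2; the loss shrinks as the policy raises the chosen response relative to the rejected one.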
1 code implementation • 10 Jul 2024 • Junkang Wu, Yuexiang Xie, Zhengyi Yang, Jiancan Wu, Jiawei Chen, Jinyang Gao, Bolin Ding, Xiang Wang, Xiangnan He
We categorize noise into pointwise noise, which includes low-quality data points, and pairwise noise, which encompasses erroneous data pair associations that affect preference rankings.
1 code implementation • 10 Jul 2024 • An Zhang, Han Wang, Xiang Wang, Tat-Seng Chua
Domain Generalization (DG), designed to enhance out-of-distribution (OOD) generalization, is all about learning invariance against domain shifts utilizing sufficient supervision signals.
1 code implementation • 7 Jul 2024 • Leheng Sheng, An Zhang, Yi Zhang, Yuxin Chen, Xiang Wang, Tat-Seng Chua
Contrary to prevailing understanding that LMs and traditional recommenders learn two distinct representation spaces due to the huge gap in language and behavior modeling objectives, this work re-examines such understanding and explores extracting a recommendation space directly from the language representation space.
1 code implementation • 4 Jul 2024 • Chengpeng Li, Guanting Dong, Mingfeng Xue, Ru Peng, Xiang Wang, Dayiheng Liu
In this paper, we introduce a series of LLMs that employs the Decomposition of thought with code assistance and self-correction for mathematical reasoning, dubbed as DotaMath.
1 code implementation • 18 Jun 2024 • Huaxin Zhang, Xiaohao Xu, Xiang Wang, Jialong Zuo, Chuchu Han, Xiaonan Huang, Changxin Gao, Yuehuan Wang, Nong Sang
We train a lightweight temporal sampler to select frames with high anomaly response and fine-tune a multimodal large language model (LLM) to generate explanatory content.
1 code implementation • 13 Jun 2024 • Yuxin Chen, Junfei Tan, An Zhang, Zhengyi Yang, Leheng Sheng, Enzhi Zhang, Xiang Wang, Tat-Seng Chua
Specifically, we incorporate multiple negatives in user preference data and devise an alternative version of DPO loss tailored for LM-based recommenders, which is extended from the traditional full-ranking Plackett-Luce (PL) model to partial rankings and connected to softmax sampling strategies.
1 code implementation • 9 Jun 2024 • Hao Li, Chenghao Yang, An Zhang, Yang Deng, Xiang Wang, Tat-Seng Chua
Crucial to addressing this real-world need are event summary and persona management, which enable reasoning for appropriate long-term dialogue responses.
2 code implementations • 5 Jun 2024 • Kang You, Zekai Xu, Chen Nie, Zhijie Deng, Qinghai Guo, Xiang Wang, Zhezhi He
Spiking neural network (SNN) has attracted great attention due to its characteristic of high efficiency and accuracy.
2 code implementations • 3 Jun 2024 • Xiang Wang, Shiwei Zhang, Changxin Gao, Jiayu Wang, Xiaoqiang Zhou, Yingya Zhang, Luxin Yan, Nong Sang
First, to reduce the optimization difficulty and ensure temporal coherence, we map the reference image along with the posture guidance and noise video into a common feature space by incorporating a unified video diffusion model.
no code implementations • 24 May 2024 • Yuyue Zhao, Jiancan Wu, Xiang Wang, Wei Tang, Dingxian Wang, Maarten de Rijke
Through the integration of LLMs, ToolRec enables conventional recommender systems to become external tools with a natural language interface.
1 code implementation • 23 May 2024 • Zhiyuan Liu, Yaorui Shi, An Zhang, Sihang Li, Enzhi Zhang, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua
To resolve the challenges above, we propose a new pretraining method, ReactXT, for reaction-text modeling, and a new dataset, OpenExp, for experimental procedure prediction.
1 code implementation • 21 May 2024 • Zhiyuan Liu, An Zhang, Hao Fei, Enzhi Zhang, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua
ProtT3 empowers an LM to understand protein sequences of amino acids by incorporating a PLM as its protein understanding module, enabling effective protein-to-text generation.
no code implementations • 8 May 2024 • Yongheng Zhang, Tingwen Du, Yunshan Ma, Xiang Wang, Yi Xie, Guozheng Yang, Yuliang Lu, Ee-Chien Chang
Thus, we propose a fully automatic LLM-based framework to construct attack knowledge graphs, named AttacKG+.
1 code implementation • 23 Apr 2024 • Buyun He, Yingguang Yang, Qi Wu, Hao Liu, Renyu Yang, Hao Peng, Xiang Wang, Yong Liao, Pengyuan Zhou
To tackle these challenges, we propose BotDGT, a novel framework that not only considers the topological structure, but also effectively incorporates the dynamic nature of social networks.
no code implementations • 2 Apr 2024 • Yunshan Ma, Yingzhi He, Wenjun Zhong, Xiang Wang, Roger Zimmermann, Tat-Seng Chua
However, the cross-item relations have been under-explored in current multimodal pre-trained models.
1 code implementation • CVPR 2024 • Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian
Text-to-image (T2I) generative models have recently emerged as a powerful tool, enabling the creation of photo-realistic images and giving rise to a multitude of applications.
no code implementations • 14 Mar 2024 • Liangrui Pan, Yijun Peng, Yan Li, Xiang Wang, Wenjuan Liu, Liwen Xu, Qingchun Liang, Shaoliang Peng
To mitigate the impact of missing features within the modality on prediction accuracy, we devised a convolutional masked autoencoder (CMAE) to process the heterogeneous graph post-feature reconstruction.
1 code implementation • 13 Mar 2024 • Zhishuai Li, Xiang Wang, Jingjing Zhao, Sun Yang, Guoqing Du, Xiaoru Hu, Bin Zhang, Yuxiao Ye, Ziyue Li, Rui Zhao, Hangyu Mao
Then, in the first stage, question-SQL pairs are retrieved as few-shot demonstrations, prompting the LLM to generate a preliminary SQL (PreSQL).
Ranked #2 on Text-To-SQL on spider
1 code implementation • 10 Mar 2024 • Huaxin Zhang, Xiang Wang, Xiaohao Xu, Xiaonan Huang, Chuchu Han, Yuehuan Wang, Changxin Gao, Shanjun Zhang, Nong Sang
In recent years, video anomaly detection has been extensively investigated in both unsupervised and weakly supervised settings to alleviate costly temporal labeling.
no code implementations • 27 Feb 2024 • Tailai Wen, Da Ke, Xiang Wang, Zhitao Huang
Deep learning algorithms have become an essential component in the field of cognitive radio, especially playing a pivotal role in automatic modulation classification.
1 code implementation • 21 Feb 2024 • An Zhang, Wenchang Ma, Pengbo Wei, Leheng Sheng, Xiang Wang
However, we have discovered that this aggregation mechanism comes with a drawback, which amplifies biases present in the interaction graph.
1 code implementation • 6 Feb 2024 • Junfeng Fang, Shuai Zhang, Chang Wu, Zhengyi Yang, Zhiyuan Liu, Sihang Li, Kun Wang, Wenjie Du, Xiang Wang
Molecular Relational Learning (MRL), aiming to understand interactions between molecular pairs, plays a pivotal role in advancing biochemical research.
no code implementations • 5 Feb 2024 • Junfeng Fang, Xinglin Li, Yongduo Sui, Yuan Gao, Guibin Zhang, Kun Wang, Xiang Wang, Xiangnan He
Graph representation learning on vast datasets, like web data, has made significant strides.
no code implementations • 5 Feb 2024 • Yuan Gao, Haokun Chen, Xiang Wang, Zhicai Wang, Xue Wang, Jinyang Gao, Bolin Ding
Our research demonstrates the efficacy of leveraging AIGS and the DiffsFormer architecture to mitigate data scarcity in stock forecasting tasks.
1 code implementation • 25 Jan 2024 • Sihang Li, Zhiyuan Liu, Yanchen Luo, Xiang Wang, Xiangnan He, Kenji Kawaguchi, Tat-Seng Chua, Qi Tian
Through 3D molecule-text alignment and 3D molecule-centric instruction tuning, 3D-MoLM establishes an integration of 3D molecular encoder and LM.
1 code implementation • 25 Jan 2024 • Yuan Gao, Xiang Wang, Xiangnan He, Zhenguang Liu, Huamin Feng, Yongdong Zhang
Graph anomaly detection (GAD) is a challenging binary classification problem due to its different structural distribution between anomalies and normal nodes -- abnormal nodes are a minority, therefore holding high heterophily and low homophily compared to normal nodes.
2 code implementations • 10 Jan 2024 • Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu
Next, we discuss several key challenges to achieve intelligent, efficient and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges.
1 code implementation • CVPR 2024 • Yicong Li, Na Zhao, Junbin Xiao, Chun Feng, Xiang Wang, Tat-Seng Chua
In this regard, we propose a novel task, Language-guided Affordance Segmentation on 3D Object (LASO), which challenges a model to segment a 3D object's part relevant to a given affordance question.
1 code implementation • CVPR 2024 • Xiang Wang, Shiwei Zhang, Hangjie Yuan, Zhiwu Qing, Biao Gong, Yingya Zhang, Yujun Shen, Changxin Gao, Nong Sang
Following such a pipeline, we study the effect of doubling the scale of training set (i.e., video-only WebVid10M) with some randomly collected text-free videos and are encouraged to observe the performance improvement (FID from 9.67 to 8.19 and FVD from 484 to 441), demonstrating the scalability of our approach.
Ranked #7 on Text-to-Video Generation on MSR-VTT
1 code implementation • 20 Dec 2023 • Junkang Wu, Jiawei Chen, Jiancan Wu, Wentao Shi, Jizhi Zhang, Xiang Wang
Loss functions steer the optimization direction of recommendation models and are critical to model performance, but have received relatively little attention in recent recommendation research.
1 code implementation • CVPR 2024 • Hangjie Yuan, Shiwei Zhang, Xiang Wang, Yujie Wei, Tao Feng, Yining Pan, Yingya Zhang, Ziwei Liu, Samuel Albanie, Dong Ni
To tackle this problem, we propose InstructVideo to instruct text-to-video diffusion models with human feedback by reward fine-tuning.
1 code implementation • 15 Dec 2023 • Yifeng Ma, Shiwei Zhang, Jiayu Wang, Xiang Wang, Yingya Zhang, Zhidong Deng
To more conveniently specify personalized emotions, a diffusion-based style predictor is utilized to predict the personalized emotion directly from the audio, eliminating the need for extra emotion reference.
2 code implementations • 14 Dec 2023 • Xiang Wang, Shiwei Zhang, Han Zhang, Yu Liu, Yingya Zhang, Changxin Gao, Nong Sang
Consistency models have demonstrated powerful capability in efficient image generation and allowed synthesis within a few sampling steps, alleviating the high computational cost in diffusion models.
1 code implementation • CVPR 2024 • Zhiwu Qing, Shiwei Zhang, Jiayu Wang, Xiang Wang, Yujie Wei, Yingya Zhang, Changxin Gao, Nong Sang
At the structure level, we decompose the T2V task into two steps, including spatial reasoning and temporal reasoning, using a unified denoiser.
Ranked #6 on Text-to-Video Generation on MSR-VTT
1 code implementation • 5 Dec 2023 • Jiayi Liao, Sihang Li, Zhengyi Yang, Jiancan Wu, Yancheng Yuan, Xiang Wang, Xiangnan He
Treating the "sequential behaviors of users" as a distinct modality beyond texts, we employ a projector to align the traditional recommender's ID embeddings with the LLM's input space.
1 code implementation • 2 Dec 2023 • Yunshan Ma, Chenchen Ye, Zijian Wu, Xiang Wang, Yixin Cao, Liang Pang, Tat-Seng Chua
Temporal complex event forecasting aims to predict the future events given the observed events from history.
1 code implementation • 28 Nov 2023 • Yunshan Ma, Yingzhi He, Xiang Wang, Yinwei Wei, Xiaoyu Du, Yuyangzi Fu, Tat-Seng Chua
It does, however, have two limitations: 1) the two-view formulation does not fully exploit all the heterogeneous relations among users, bundles and items; and 2) the "early contrast and late fusion" framework is less effective in capturing user preference and difficult to generalize to multiple views.
3 code implementations • 7 Nov 2023 • Shiwei Zhang, Jiayu Wang, Yingya Zhang, Kang Zhao, Hangjie Yuan, Zhiwu Qin, Xiang Wang, Deli Zhao, Jingren Zhou
By this means, I2VGen-XL can simultaneously enhance the semantic accuracy, continuity of details and clarity of generated videos.
no code implementations • 1 Nov 2023 • You Zhou, Xiujing Lin, Xiang Zhang, Maolin Wang, Gangwei Jiang, Huakang Lu, Yupeng Wu, Kai Zhang, Zhe Yang, Kehang Wang, Yongduo Sui, Fengwei Jia, Zuoli Tang, Yao Zhao, Hongxuan Zhang, Tiannuo Yang, Weibo Chen, Yunong Mao, Yi Li, De Bao, Yu Li, Hongrui Liao, Ting Liu, Jingwen Liu, Jinchi Guo, Xiangyu Zhao, Ying WEI, Hong Qian, Qi Liu, Xiang Wang, Wai Kin, Chan, Chenliang Li, Yusen Li, Shiyu Yang, Jining Yan, Chao Mou, Shuai Han, Wuxia Jin, Guannan Zhang, Xiaodong Zeng
To tackle the challenges of computing resources and environmental impact of AI, Green Computing has become a hot research topic.
2 code implementations • 31 Oct 2023 • Zhengyi Yang, Jiancan Wu, Yanchen Luo, Jizhi Zhang, Yancheng Yuan, An Zhang, Xiang Wang, Xiangnan He
Sequential recommendation is to predict the next item of interest for a user, based on her/his interaction history with previous items.
1 code implementation • NeurIPS 2023 • Zhengyi Yang, Jiancan Wu, Zhicai Wang, Xiang Wang, Yancheng Yuan, Xiangnan He
Scrutinizing previous studies, we can summarize a common learning-to-classify paradigm -- given a positive item, a recommender model performs negative sampling to add negative items and learns to classify whether the user prefers them or not, based on his/her historical interaction sequence.
1 code implementation • NeurIPS 2023 • An Zhang, Leheng Sheng, Zhibo Cai, Xiang Wang, Tat-Seng Chua
To bridge the gap, we delve into the reasons underpinning the success of contrastive loss in CF, and propose a principled Adversarial InfoNCE loss (AdvInfoNCE), which is a variant of InfoNCE, specially tailored for CF methods.
1 code implementation • 28 Oct 2023 • Yunshan Ma, Xiaohao Liu, Yinwei Wei, Zhulin Tao, Xiang Wang, Tat-Seng Chua
Specifically, we use self-attention modules to combine the multimodal and multi-item features, and then leverage both item- and bundle-level contrastive learning to enhance the representation learning, thus to counter the modality missing, noise, and sparsity problems.
no code implementations • 25 Oct 2023 • Chengpeng Li, Zhengyi Yang, Jizhi Zhang, Jiancan Wu, Dingxian Wang, Xiangnan He, Xiang Wang
Therefore, the data sparsity issue of reward signals and state transitions is very severe, while it has long been overlooked by existing RL recommenders. Worse still, RL methods learn through the trial-and-error mode, but negative feedback cannot be obtained in implicit feedback recommendation tasks, which aggravates the overestimation problem of offline RL recommender.
1 code implementation • NeurIPS 2023 • Zhiyuan Liu, Yaorui Shi, An Zhang, Enzhi Zhang, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua
Our results show that a subgraph-level tokenizer and a sufficiently expressive decoder with remask decoding have a large impact on the encoder's representation learning.
1 code implementation • 20 Oct 2023 • Yaorui Shi, An Zhang, Enzhi Zhang, Zhiyuan Liu, Xiang Wang
Predicting chemical reactions, a fundamental challenge in chemistry, involves forecasting the resulting products from a given reaction process.
1 code implementation • 19 Oct 2023 • Zhiyuan Liu, Sihang Li, Yanchen Luo, Hao Fei, Yixin Cao, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua
MolCA enables an LM (e.g., Galactica) to understand both text- and graph-based molecular contents via the cross-modal projector.
Ranked #7 on Molecule Captioning on ChEBI-20
1 code implementation • 18 Oct 2023 • Ruihao Shui, Yixin Cao, Xiang Wang, Tat-Seng Chua
Large language models (LLMs) have demonstrated great potential for domain-specific applications, such as the law domain.
no code implementations • 16 Oct 2023 • Xiang Wang, Shiwei Zhang, Hangjie Yuan, Yingya Zhang, Changxin Gao, Deli Zhao, Nong Sang
In this paper, we develop an effective plug-and-play framework called CapFSAR to exploit the knowledge of multimodal models without manually annotating text.
1 code implementation • 16 Oct 2023 • An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-Seng Chua
Recommender systems are the cornerstone of today's information dissemination, yet a disconnect between offline metrics and online performance greatly hinders their development.
1 code implementation • 16 Oct 2023 • An Zhang, Wenchang Ma, Jingnan Zheng, Xiang Wang, Tat-Seng Chua
The popularity shortcut tricks are good for in-distribution (ID) performance but generalize poorly to out-of-distribution (OOD) data, i.e., when the popularity distribution of test data shifts w.r.t.
1 code implementation • 9 Oct 2023 • Chengpeng Li, Zheng Yuan, Hongyi Yuan, Guanting Dong, Keming Lu, Jiancan Wu, Chuanqi Tan, Xiang Wang, Chang Zhou
In this paper, we investigate such data augmentation in math reasoning and aim to answer: (1) What strategies of data augmentation are more effective; (2) What is the scaling relationship between the amount of augmented data and model performance; and (3) Can data augmentation incentivize generalization to out-of-domain mathematical reasoning tasks?
Ranked #60 on Arithmetic Reasoning on GSM8K (using extra training data)
no code implementations • 26 Sep 2023 • Jiayi Liao, Xu Chen, Qiang Fu, Lun Du, Xiangnan He, Xiang Wang, Shi Han, Dongmei Zhang
Recent years have witnessed the substantial progress of large-scale models across various domains, such as natural language processing and computer vision, facilitating the expression of concrete concepts.
1 code implementation • 24 Aug 2023 • Huaxin Zhang, Xiang Wang, Xiaohao Xu, Zhiwu Qing, Changxin Gao, Nong Sang
For snippet-level learning, we introduce an online-updated memory to store reliable snippet prototypes for each class.
Ranked #1 on Weakly Supervised Action Localization on BEOID
3 code implementations • ICCV 2023 • Hangjie Yuan, Shiwei Zhang, Xiang Wang, Samuel Albanie, Yining Pan, Tao Feng, Jianwen Jiang, Dong Ni, Yingya Zhang, Deli Zhao
In this paper, we propose RLIPv2, a fast converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data.
Ranked #1 on Zero-Shot Human-Object Interaction Detection on HICO-DET (using extra training data)
1 code implementation • 12 Aug 2023 • Yunshan Ma, Chenchen Ye, Zijian Wu, Xiang Wang, Yixin Cao, Tat-Seng Chua
The task of event forecasting aims to model the relational and temporal patterns of historical events and to forecast what will happen in the future.
5 code implementations • 12 Aug 2023 • Jiuniu Wang, Hangjie Yuan, Dayou Chen, Yingya Zhang, Xiang Wang, Shiwei Zhang
This paper introduces ModelScopeT2V, a text-to-video synthesis model that evolves from a text-to-image synthesis model (i.e., Stable Diffusion).
Ranked #9 on Text-to-Video Generation on MSR-VTT
1 code implementation • 8 Aug 2023 • Wei Ji, Xiangyan Liu, An Zhang, Yinwei Wei, Yongxin Ni, Xiang Wang
To be specific, we first introduce an ID-aware Multi-modal Transformer module in the item representation learning stage to facilitate information interaction among different features.
no code implementations • 7 Aug 2023 • Yicong Li, Xun Yang, An Zhang, Chun Feng, Xiang Wang, Tat-Seng Chua
This paper identifies two kinds of redundancy in the current VideoQA paradigm.
1 code implementation • 1 Aug 2023 • Hengchu Lu, Yuanjie Shao, Xiang Wang, Changxin Gao
In this way, the proposed ASC enables explicit transfer of source domain knowledge to prevent the model from overfitting the target domain.
1 code implementation • ICCV 2023 • Yicong Li, Junbin Xiao, Chun Feng, Xiang Wang, Tat-Seng Chua
We then conduct extensive studies to verify the importance of STR as well as the proposed answer interaction mechanism.
1 code implementation • SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023 • Yinwei Wei, Wenqi Liu, Fan Liu, Xiang Wang, Liqiang Nie, Tat-Seng Chua
Considering its challenges in effectiveness and efficiency, we propose a novel Transformer-based recommendation model, termed the Light Graph Transformer (LightGT).
Ranked #1 on Multi-Media Recommendation on Kwai (Recall@10 metric)
1 code implementation • 9 Jul 2023 • Liangrui Pan, Xiang Wang, Qingchun Liang, Jiandong Shang, Wenjuan Liu, Liwen Xu, Shaoliang Peng
Methods: We propose a model, named DEDUCE, based on symmetric multi-head attention encoders (SMAE), for unsupervised contrastive learning to analyze multi-omics cancer data, with the aim of identifying and characterizing cancer subtypes.
1 code implementation • 5 Jun 2023 • Fangfu Liu, Wenchang Ma, An Zhang, Xiang Wang, Yueqi Duan, Tat-Seng Chua
Discovering causal structure from purely observational data (i.e., causal discovery), aiming to identify causal relationships among variables, is a fundamental task in machine learning.
4 code implementations • NeurIPS 2023 • Xiang Wang, Hangjie Yuan, Shiwei Zhang, Dayou Chen, Jiuniu Wang, Yingya Zhang, Yujun Shen, Deli Zhao, Jingren Zhou
The pursuit of controllability as a higher standard of visual content creation has yielded remarkable progress in customizable image synthesis.
Ranked #5 on Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset (using extra training data)
1 code implementation • 6 Apr 2023 • Jiancan Wu, Yi Yang, Yuchun Qian, Yongduo Sui, Xiang Wang, Xiangnan He
Then, we identify the crux of why the traditional influence function fails for graph unlearning, and devise Graph Influence Function (GIF), a model-agnostic unlearning method that can efficiently and accurately estimate parameter changes in response to an $\epsilon$-mass perturbation in deleted data.
1 code implementation • CVPR 2023 • Xiang Wang, Shiwei Zhang, Zhiwu Qing, Changxin Gao, Yingya Zhang, Deli Zhao, Nong Sang
To address these issues, we develop a Motion-augmented Long-short Contrastive Learning (MoLo) method that contains two crucial components, including a long-short contrastive objective and a motion autodecoder.
1 code implementation • CVPR 2023 • Jun Cen, Shiwei Zhang, Xiang Wang, Yixuan Pei, Zhiwu Qing, Yingya Zhang, Qifeng Chen
In this paper, we begin with analyzing the feature representation behavior in the open-set action recognition (OSAR) problem based on the information bottleneck (IB) theory, and propose to enlarge the instance-specific (IS) and class-specific (CS) information contained in the feature for better performance.
1 code implementation • ECML-PKDD 2023 • Sen Zhang, Senzhang Wang, Xiang Wang, Shigeng Zhang, Hao Miao, Junxing Zhu
We first project users and trajectories into the common latent feature space through learning a projection function (generator) to minimize the distance between the user distribution and the trajectory distribution.
1 code implementation • 6 Mar 2023 • Xiang Wang, Shiwei Zhang, Jun Cen, Changxin Gao, Yingya Zhang, Deli Zhao, Nong Sang
Learning from large-scale contrastive language-image pre-training like CLIP has shown remarkable success in a wide range of downstream tasks recently, but it is still under-explored on the challenging few-shot action recognition (FSAR) task.
1 code implementation • 6 Mar 2023 • An Zhang, Fangfu Liu, Wenchang Ma, Zhibo Cai, Xiang Wang, Tat-Seng Chua
Despite great success in low-dimensional linear systems, it has been observed that these approaches overly exploit easier-to-fit samples, thus inevitably learning spurious edges.
1 code implementation • 10 Feb 2023 • An Zhang, Jingnan Zheng, Xiang Wang, Yancheng Yuan, Tat-Seng Chua
Collaborative Filtering (CF) models, despite their great success, suffer from severe performance drops due to popularity distribution shifts, where these changes are ubiquitous and inevitable in real-world scenarios.
1 code implementation • 9 Jan 2023 • Xiang Wang, Shiwei Zhang, Zhiwu Qing, Zhengrong Zuo, Changxin Gao, Rong Jin, Nong Sang
To be specific, HyRSM++ consists of two key components, a hybrid relation module and a temporal set matching metric.
no code implementations • ICCV 2023 • Yixuan Pei, Zhiwu Qing, Shiwei Zhang, Xiang Wang, Yingya Zhang, Deli Zhao, Xueming Qian
In this paper, we will fill this gap by learning multiple prompts based on a powerful image-language pre-trained model, i.e., CLIP, making it fit for video class-incremental learning (VCIL).
1 code implementation • 20 Dec 2022 • Yinwei Wei, Xiang Wang, Liqiang Nie, Shaoyu Li, Dingxian Wang, Tat-Seng Chua
Knowledge Graph (KG), as side information, tends to be utilized to supplement the collaborative filtering (CF) based recommendation model.
no code implementations • 25 Nov 2022 • Tianpeng Bao, Jiadong Chen, Wei Li, Xiang Wang, Jingjing Fei, Liwei Wu, Rui Zhao, Ye Zheng
However, existing datasets for unsupervised anomaly detection are biased towards manufacturing inspection and do not consider maintenance inspection, which is usually conducted in uncontrolled outdoor environments with varying camera viewpoints, messy backgrounds, and degraded object surfaces after long-term operation.
no code implementations • 19 Nov 2022 • Xiang Wang, Yimin Yang, Zhichang Guo, Zhili Zhou, Yu Liu, Qixiang Pang, Shan Du
First, the UBCDTN is able to produce an approximated real-like LR image through transferring the LR image from an artificially degraded domain to the real-world LR image domain.
no code implementations • 18 Nov 2022 • Xiang Wang, Yimin Yang, Qixiang Pang, Xiao Lu, Yu Liu, Shan Du
In this paper, we propose a novel face super-resolution method, namely Semantic Encoder guided Generative Adversarial Face Ultra-Resolution Network (SEGA-FURN) to ultra-resolve an unaligned tiny LR face image to its HR counterpart with multiple ultra-upscaling factors (e.g., 4x and 8x).
1 code implementation • NeurIPS 2023 • Yongduo Sui, Qitian Wu, Jiancan Wu, Qing Cui, Longfei Li, Jun Zhou, Xiang Wang, Xiangnan He
From the perspective of invariant learning and stable learning, a recently well-established paradigm for out-of-distribution generalization, stable features of the graph are assumed to causally determine labels, while environmental features tend to be unstable and can lead to the two primary types of distribution shifts.
no code implementations • 2 Nov 2022 • Yixuan Pei, Zhiwu Qing, Jun Cen, Xiang Wang, Shiwei Zhang, Yaxiong Wang, Mingqian Tang, Nong Sang, Xueming Qian
The former reduces the memory cost by preserving only one condensed frame instead of the whole video, while the latter aims to compensate for the spatio-temporal details lost in the Frame Condensing stage.
1 code implementation • 24 Oct 2022 • Muthu Chidambaram, Xiang Wang, Chenwei Wu, Rong Ge
Mixup is a data augmentation technique that relies on training using random convex combinations of data points and their labels.
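The "random convex combination" in the Mixup snippet above can be sketched in a few lines of plain Python. This is only an illustration, not the authors' code: the function name `mixup_pair` is hypothetical, and drawing the mixing weight from a Beta(alpha, alpha) distribution is an assumption taken from the usual Mixup formulation rather than from this listing.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Mix two (features, one-hot label) pairs with a random convex weight.

    Assumption: the weight lam is sampled from Beta(alpha, alpha), as in the
    common Mixup formulation; the listing itself does not specify this.
    """
    lam = rng.betavariate(alpha, alpha)  # lam lies in [0, 1]
    # The same weight is applied to the inputs and to the labels.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


if __name__ == "__main__":
    random.seed(0)
    x, y, lam = mixup_pair([0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [0.0, 1.0])
    print(lam, x, y)
```

Because the two weights sum to one, mixing two one-hot labels yields a soft label whose entries still sum to one, which is why the mixed pair can be trained on with a standard cross-entropy loss.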
no code implementations • 20 Oct 2022 • Wei Zhang, Jiaxi Cao, Xiang Wang, Enqi Tian, Bin Li
In recent years, head-mounted near-eye display devices have become the key hardware foundation for virtual reality and augmented reality.
1 code implementation • 20 Oct 2022 • An Zhang, Wenchang Ma, Xiang Wang, Tat-Seng Chua
Collaborative filtering (CF) models easily suffer from popularity bias, which makes recommendation deviate from users' actual preferences.
no code implementations • 7 Oct 2022 • Xingyu Zhu, Zixuan Wang, Xiang Wang, Mo Zhou, Rong Ge
Globally, we observe that the training dynamics for our example have an interesting bifurcating behavior, which was also observed in the training of neural nets.
no code implementations • 6 Oct 2022 • Xiang Wang, Kai Wang, Xiaohong Li, Shiguo Lian
To compensate for the imbalance of different kernel numbers and classify kernels with multiple flaws accurately, we propose a multi-stage workflow which is able to locate the kernels in the captured image and classify their properties.
no code implementations • 3 Oct 2022 • Xiang Wang, Annie N. Wang, Mo Zhou, Rong Ge
Monotonic linear interpolation (MLI) - on the line connecting a random initialization with the minimizer it converges to, the loss and accuracy are monotonic - is a phenomenon that is commonly observed in the training of neural networks.
1 code implementation • 26 Jul 2022 • Yicong Li, Xiang Wang, Junbin Xiao, Tat-Seng Chua
Specifically, the equivariant grounding encourages the answering to be sensitive to the semantic changes in the causal scene and question; in contrast, the invariant grounding enforces the answering to be insensitive to the changes in the environment scene.
1 code implementation • 24 Jul 2022 • Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Xiang Wang, Yuehuan Wang, Yiliang Lv, Changxin Gao, Nong Sang
Inspired by this, we propose Masked Action Recognition (MAR), which reduces the redundant computation by discarding a proportion of patches and operating only on a part of the videos.
Ranked #11 on Action Recognition on Something-Something V2
no code implementations • 18 Jul 2022 • Amir H. Ashouri, Mostafa Elhoushi, Yuzhe Hua, Xiang Wang, Muhammad Asif Manzoor, Bryan Chan, Yaoqing Gao
This paper presents MLGOPerf, the first end-to-end framework capable of optimizing performance using LLVM's ML-Inliner.
1 code implementation • IEEE Transactions on Multimedia (TMM) 2022 • Zhulin Tao, Xiaohao Liu, Yewei Xia, Xiang Wang, Lifang Yang, Xianglin Huang
To capture multi-modal patterns in the data itself, we go beyond the supervised learning paradigm, and incorporate the idea of self-supervised learning (SSL) into multimedia recommendation.
Ranked #3 on Multi-modal Recommendation on Amazon Sports
no code implementations • 18 Jun 2022 • Xiang Wang, Huaxin Zhang, Shiwei Zhang, Changxin Gao, Yuanjie Shao, Nong Sang
This technical report presents our first-place solution for the temporal action detection task in the CVPR-2022 ActivityNet Challenge.
1 code implementation • 16 Jun 2022 • Sihang Li, Xiang Wang, An Zhang, Yingxin Wu, Xiangnan He, Tat-Seng Chua
Specifically, without supervision signals, RGCL uses a rationale generator to reveal salient features about graph instance-discrimination as the rationale, and then creates rationale-aware views for contrastive learning.
1 code implementation • CVPR 2022 • Yicong Li, Xiang Wang, Junbin Xiao, Wei Ji, Tat-Seng Chua
At its core is understanding the alignments between visual scenes in video and linguistic semantics in question to yield the answer.
1 code implementation • 1 Jun 2022 • Yunshan Ma, Yingzhi He, An Zhang, Xiang Wang, Tat-Seng Chua
Recent methods usually take advantage of both user-bundle and user-item interaction information to obtain informative representations for users and bundles, corresponding to bundle view and item view, respectively.
no code implementations • 31 May 2022 • Yu Wang, An Zhang, Xiang Wang, Yancheng Yuan, Xiangnan He, Tat-Seng Chua
This paper proposes Differentiable Invariant Causal Discovery (DICD), utilizing the multi-environment information based on a differentiable framework to avoid learning spurious edges and wrong causal directions.
no code implementations • 30 May 2022 • Ye Zheng, Xiang Wang, Yu Qi, Wei Li, Liwei Wu
Since the MVTec AD dataset was proposed, a steady stream of new methods has pushed its precision to saturation.
no code implementations • 3 May 2022 • Zhenguang Liu, Sifan Wu, Chejian Xu, Xiang Wang, Lei Zhu, Shuang Wu, Fuli Feng
3) To enhance texture details, we encode facial features with geometric guidance and employ local GANs to refine the face, feet, and hands.
no code implementations • 2 May 2022 • Xiaohong Li, Xiang Wang, Kai Wang, Shiguo Lian
Generating synchronized and natural lip movement with speech is one of the most important tasks in creating realistic virtual characters.
1 code implementation • CVPR 2022 • Xiang Wang, Shiwei Zhang, Zhiwu Qing, Mingqian Tang, Zhengrong Zuo, Changxin Gao, Rong Jin, Nong Sang
To overcome the two limitations, we propose a novel Hybrid Relation guided Set Matching (HyRSM) approach that incorporates two key components: hybrid relation module and set matching metric.
1 code implementation • 26 Apr 2022 • Qi Wan, Xiangnan He, Xiang Wang, Jiancan Wu, Wei Guo, Ruiming Tang
In this work, we develop a new learning paradigm named Cross Pairwise Ranking (CPR) that achieves unbiased recommendation without knowing the exposure mechanism.
1 code implementation • 23 Apr 2022 • Xiang Wang, Yingxin Wu, An Zhang, Fuli Feng, Xiangnan He, Tat-Seng Chua
Such a reward accounts for the dependency between the newly-added edge and the previously-added edges, thus reflecting whether they collaborate and form a coalition to pursue better explanations.
no code implementations • 19 Apr 2022 • Yuan Gao, Xiang Wang, Xiangnan He, Huamin Feng, Yongdong Zhang
At the core is to model the rumor characteristics inherent in rich information, such as propagation patterns in social network and semantic patterns in post content, and differentiate them from the truth.
no code implementations • CVPR 2022 • Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Yi Xu, Xiang Wang, Mingqian Tang, Changxin Gao, Rong Jin, Nong Sang
In this work, we aim to learn representations by leveraging more abundant information in untrimmed videos.
1 code implementation • CVPR 2022 • Zhenguang Liu, Runyang Feng, Haoming Chen, Shuang Wu, Yixing Gao, Yunjun Gao, Xiang Wang
State-of-the-art methods strive to incorporate additional visual evidence from neighboring frames (supporting frames) to facilitate the pose estimation of the current frame (key frame).
1 code implementation • CVPR 2022 • Jiawei Zhang, Xiang Wang, Xiao Bai, Chen Wang, Lei Huang, Yimin Chen, Lin Gu, Jun Zhou, Tatsuya Harada, Edwin R. Hancock
The stereo contrastive feature loss function explicitly constrains the consistency between learned features of matching pixel pairs which are observations of the same 3D points.
1 code implementation • ICLR 2022 • Ying-Xin Wu, Xiang Wang, An Zhang, Xiangnan He, Tat-Seng Chua
Intrinsic interpretability of graph neural networks (GNNs) is to find a small subset of the input graph's features -- rationale -- which guides the model prediction.
no code implementations • 21 Jan 2022 • Ying-Xin Wu, Xiang Wang, An Zhang, Xia Hu, Fuli Feng, Xiangnan He, Tat-Seng Chua
In this work, we propose Deconfounded Subgraph Evaluation (DSE) which assesses the causal effect of an explanatory subgraph on the model prediction.
1 code implementation • 14 Jan 2022 • Zhiyuan Liu, Yixin Cao, Fuli Feng, Xiang Wang, Jie Tang, Kenji Kawaguchi, Tat-Seng Chua
We present a framework of Training Free Graph Matching (TFGM) to boost the performance of Graph Neural Networks (GNNs) based graph matching, providing a fast promising solution without training (training-free).
no code implementations • 7 Jan 2022 • Jiancan Wu, Xiang Wang, Xingyu Gao, Jiawei Chen, Hongcheng Fu, Tianyu Qiu
In this work, we aim to offer a better understanding of SSM for item recommendation.
1 code implementation • 30 Dec 2021 • Yongduo Sui, Xiang Wang, Jiancan Wu, Min Lin, Xiangnan He, Tat-Seng Chua
To endow the classifier with better interpretation and generalization, we propose the Causal Attention Learning (CAL) strategy, which discovers the causal patterns and mitigates the confounding effect of shortcuts.
1 code implementation • NeurIPS 2021 • Xiang Wang, Yingxin Wu, An Zhang, Xiangnan He, Tat-Seng Chua
A performant paradigm towards multi-grained explainability has so far been lacking and is thus a focus of our work.
1 code implementation • 18 Nov 2021 • Xiang Bai, Hanchen Wang, Liya Ma, Yongchao Xu, Jiefeng Gan, Ziwei Fan, Fan Yang, Ke Ma, Jiehua Yang, Song Bai, Chang Shu, Xinyu Zou, Renhao Huang, Changzheng Zhang, Xiaowu Liu, Dandan Tu, Chuou Xu, Wenqing Zhang, Xi Wang, Anguo Chen, Yu Zeng, Dehua Yang, Ming-Wei Wang, Nagaraj Holalkere, Neil J. Halin, Ihab R. Kamel, Jia Wu, Xuehua Peng, Xiang Wang, Jianbo Shao, Pattanasak Mongkolwat, Jianjun Zhang, Weiyang Liu, Michael Roberts, Zhongzhao Teng, Lucian Beer, Lorena Escudero Sanchez, Evis Sala, Daniel Rubin, Adrian Weller, Joan Lasenby, Chuangsheng Zheng, Jianming Wang, Zhen Li, Carola-Bibiane Schönlieb, Tian Xia
Artificial intelligence (AI) provides a promising substitution for streamlining COVID-19 diagnoses.
5 code implementations • 15 Nov 2021 • Jiawei Yu, Ye Zheng, Xiang Wang, Wei Li, Yushuang Wu, Rui Zhao, Liwei Wu
However, current methods cannot effectively map image features to a tractable base distribution and ignore the relationship between local and global features, which is important for identifying anomalies.
Ranked #28 on Anomaly Detection on MVTec LOCO AD
Unsupervised Anomaly Detection
Weakly Supervised Defect Detection
no code implementations • 10 Nov 2021 • Yi Lin, Jianchao Su, Xiang Wang, Xiang Li, Jingen Liu, Kwang-Ting Cheng, Xin Yang
We have evaluated our approach using the 20 CTPA test dataset from the PE challenge, achieving a sensitivity of 78.9%, 80.7% and 80.7% at 2 false positives per volume at 0 mm, 2 mm and 5 mm localization error, which is superior to the state-of-the-art methods.
1 code implementation • ICLR 2022 • Muthu Chidambaram, Xiang Wang, Yuzheng Hu, Chenwei Wu, Rong Ge
Despite seeing very few true data points during training, models trained using Mixup seem to still minimize the original empirical risk and exhibit better generalization and robustness on various tasks when compared to standard training.
2 code implementations • 11 Oct 2021 • Xiang Wang, Xinlei Chen, Simon S. Du, Yuandong Tian
Non-contrastive methods of self-supervised learning (such as BYOL and SimSiam) learn representations by minimizing the distance between two views of the same image.
no code implementations • 9 Oct 2021 • Ye Zheng, Xiang Wang, Rui Deng, Tianpeng Bao, Rui Zhao, Liwei Wu
To facilitate the learning with only normal images, we propose a new pretext task called non-contrastive learning for the fine alignment stage.
Ranked #71 on Anomaly Detection on MVTec AD
no code implementations • 29 Sep 2021 • Yongduo Sui, Xiang Wang, Tianlong Chen, Xiangnan He, Tat-Seng Chua
In this work, we propose a simple and effective learning paradigm, Inductive Co-Pruning of GNNs (ICPG), to endow graph lottery tickets with inductive pruning capacity.
1 code implementation • 5 Aug 2021 • Yuyue Zhao, Xiang Wang, Jiawei Chen, Yashen Wang, Wei Tang, Xiangnan He, Haiyong Xie
In this work, we propose a novel Time-aware Path reasoning for Recommendation (TPRec for short) method, which leverages the potential of temporal information to offer better recommendation with plausible explanations.
1 code implementation • 2 Aug 2021 • Yanfang Wang, Yongduo Sui, Xiang Wang, Zhenguang Liu, Xiangnan He
We get inspirations from the recently proposed lottery ticket hypothesis (LTH), which argues that the dense and over-parameterized model contains a much smaller and sparser sub-model that can reach comparable performance to the full model.
1 code implementation • 1 Aug 2021 • Zunlei Feng, Lechao Cheng, Xinchao Wang, Xiang Wang, Yajie Liu, Xiangtong Du, Mingli Song
To this end, we propose a Translation Segmentation Network (Trans-Net), which comprises a segmentation network and two boundary discriminators.
1 code implementation • 12 Jul 2021 • Yinwei Wei, Xiang Wang, Qi Li, Liqiang Nie, Yan Li, Xuanping Li, Tat-Seng Chua
It aims to maximize the mutual dependencies between item content and collaborative signals.
no code implementations • 24 Jun 2021 • Zhiwu Qing, Xiang Wang, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Nong Sang
Temporal action localization aims to localize the starting and ending times of actions along with their categories.
1 code implementation • ICCV 2021 • Xiang Wang, Shiwei Zhang, Zhiwu Qing, Yuanjie Shao, Zhengrong Zuo, Changxin Gao, Nong Sang
Most recent approaches for online action detection tend to apply Recurrent Neural Networks (RNNs) to capture long-range temporal structure.
Ranked #8 on Online Action Detection on THUMOS'14
1 code implementation • 20 Jun 2021 • Xiang Wang, Zhiwu Qing, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Nong Sang
We calculate the detection results by assigning the proposals with corresponding classification results.
Ranked #3 on Temporal Action Localization on ActivityNet-1.3 (using extra training data)
no code implementations • 20 Jun 2021 • Xiang Wang, Zhiwu Qing, Ziyuan Huang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Yuanjie Shao, Nong Sang
Then our proposed Local-Global Background Modeling Network (LGBM-Net) is trained to localize instances by using only video-level labels based on Multi-Instance Learning (MIL).
Weakly-supervised Learning
Weakly-supervised Temporal Action Localization
1 code implementation • 17 Jun 2021 • Zhenguang Liu, Peng Qian, Xiang Wang, Lei Zhu, Qinming He, Shouling Ji
In this paper, we explore combining deep learning with expert patterns in an explainable fashion.
no code implementations • 15 Jun 2021 • Yutong Feng, Jianwen Jiang, Ziyuan Huang, Zhiwu Qing, Xiang Wang, Shiwei Zhang, Mingqian Tang, Yue Gao
This paper presents our solution to the AVA-Kinetics Crossover Challenge of ActivityNet workshop at CVPR 2021.
Ranked #4 on Spatio-Temporal Action Localization on AVA-Kinetics (using extra training data)
1 code implementation • 13 Jun 2021 • Zhiwu Qing, Ziyuan Huang, Xiang Wang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Marcelo H. Ang Jr, Nong Sang
This technical report analyzes an egocentric video action detection method we used in the 2021 EPIC-KITCHENS-100 competition hosted at a CVPR 2021 workshop.
no code implementations • NeurIPS 2021 • Rong Ge, Yunwei Ren, Xiang Wang, Mo Zhou
In this paper we study the training dynamics for gradient flow on over-parametrized tensor decomposition problems.
1 code implementation • 9 Jun 2021 • Ziyuan Huang, Zhiwu Qing, Xiang Wang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Zhurong Xia, Mingqian Tang, Nong Sang, Marcelo H. Ang Jr
In this paper, we present empirical results for training a stronger video vision transformer on the EPIC-KITCHENS-100 Action Recognition dataset.
1 code implementation • 22 May 2021 • Wenjie Wang, Fuli Feng, Xiangnan He, Xiang Wang, Tat-Seng Chua
In this work, we scrutinize the cause-effect factors for bias amplification, identifying the main reason lies in the confounder effect of imbalanced item distribution on user representation and prediction score.
1 code implementation • 27 Apr 2021 • Le Wu, Xiangnan He, Xiang Wang, Kun Zhang, Meng Wang
Influenced by the great success of deep learning in computer vision and language understanding, research in recommendation has shifted to inventing new recommender models based on neural networks.
no code implementations • 12 Apr 2021 • An Zhang, Xiang Wang, Chengfang Fang, Jie Shi, Tat-Seng Chua, Zehua Chen
Gradient-based attribution methods can aid in the understanding of convolutional neural networks (CNNs).
1 code implementation • CVPR 2021 • Xiang Wang, Shiwei Zhang, Zhiwu Qing, Yuanjie Shao, Changxin Gao, Nong Sang
In this paper, we focus on applying the power of self-supervised methods to improve semi-supervised action proposal generation.
Ranked #2 on Semi-Supervised Action Detection on ActivityNet-1.3
1 code implementation • CVPR 2021 • Zhiwu Qing, Haisheng Su, Weihao Gan, Dongliang Wang, Wei Wu, Xiang Wang, Yu Qiao, Junjie Yan, Changxin Gao, Nong Sang
In this paper, we propose Temporal Context Aggregation Network (TCANet) to generate high-quality action proposals through "local and global" temporal context aggregation and complementary as well as progressive boundary refinement.
Ranked #10 on Temporal Action Localization on ActivityNet-1.3
no code implementations • 10 Mar 2021 • Xiang Wang, Xiaoyong Li, Junxing Zhu, Zichen Xu, Kaijun Ren, Weiming Zhang, Xinwang Liu, Kui Yu
Real-world data usually have high dimensionality and it is important to mitigate the curse of dimensionality.
2 code implementations • 14 Feb 2021 • Xiang Wang, Tinglin Huang, Dingxian Wang, Yancheng Yuan, Zhenguang Liu, Xiangnan He, Tat-Seng Chua
In this study, we explore intents behind a user-item interaction by using auxiliary item knowledge, and propose a new model, Knowledge Graph-based Intent Network (KGIN).
no code implementations • 1 Jan 2021 • Xiang Wang, Yingxin Wu, An Zhang, Xiangnan He, Tat-Seng Chua
In this work, we focus on the causal interpretability in GNNs and propose a method, Causal Screening, from the perspective of cause-effect.
no code implementations • NeurIPS 2020 • Xiang Wang, Chenwei Wu, Jason D. Lee, Tengyu Ma, Rong Ge
We show that in a lazy training regime (similar to the NTK regime for neural networks) one needs at least $m = \Omega(d^{l-1})$, while a variant of gradient descent can find an approximate tensor when $m = O^*(r^{2.5l}\log d)$.
3 code implementations • 21 Oct 2020 • Jiancan Wu, Xiang Wang, Fuli Feng, Xiangnan He, Liang Chen, Jianxun Lian, Xing Xie
In this work, we explore self-supervised learning on the user-item graph, so as to improve the accuracy and robustness of GCNs for recommendation.
Ranked #6 on Collaborative Filtering on Yelp2018
1 code implementation • 7 Oct 2020 • Jiawei Chen, Hande Dong, Xiang Wang, Fuli Feng, Meng Wang, Xiangnan He
This motivates us to provide a systematic survey of existing work on RS biases.
no code implementations • 7 Aug 2020 • Xiang Wang, Changxin Gao, Shiwei Zhang, Nong Sang
By this means, the proposed MLTPN can learn rich and discriminative features for different action instances with different durations.
2 code implementations • 3 Jul 2020 • Xiang Wang, Hongye Jin, An Zhang, Xiangnan He, Tong Xu, Tat-Seng Chua
Such a uniform approach to modeling user interests easily results in suboptimal representations, failing to model diverse relationships and disentangle user intents in representations.
no code implementations • 1 Jul 2020 • Wenqiang Lei, Gangyi Zhang, Xiangnan He, Yisong Miao, Xiang Wang, Liang Chen, Tat-Seng Chua
Traditional recommendation systems estimate user preference on items from past interaction history, thus suffering from the limitations of obtaining fine-grained and dynamic user preference.
1 code implementation • 30 Jun 2020 • Xiang Wang, Shuai Yuan, Chenwei Wu, Rong Ge
Solving this problem using a learning-to-learn approach -- using meta-gradient descent on a meta-objective based on the trajectory that the optimizer generates -- was recently shown to be effective.
1 code implementation • 13 Jun 2020 • Xiang Wang, Baiteng Ma, Zhiwu Qing, Yongpeng Sang, Changxin Gao, Shiwei Zhang, Nong Sang
In this report, we present our solution for the task of temporal action localization (detection) (task 1) in ActivityNet Challenge 2020.
no code implementations • 13 Jun 2020 • Zhiwu Qing, Xiang Wang, Yongpeng Sang, Changxin Gao, Shiwei Zhang, Nong Sang
This technical report analyzes a temporal action localization method we used in the HACS competition, which is hosted in the ActivityNet Challenge 2020. The goal of our task is to locate the start time and end time of the action in the untrimmed video and predict the action category. First, we utilize the video-level feature information to train multiple video-level action classification models.
1 code implementation • 26 May 2020 • Xingchen Li, Xiang Wang, Xiangnan He, Long Chen, Jun Xiao, Tat-Seng Chua
Fashion outfit recommendation has attracted increasing attention from online shopping services and fashion communities. Distinct from other scenarios (e.g., social networking or content sharing) which recommend a single item (e.g., a friend or picture) to a user, outfit recommendation predicts user preference on a set of well-matched fashion items. Hence, high-quality personalized outfit recommendation should satisfy two requirements -- 1) good compatibility among fashion items and 2) consistency with user preference.
no code implementations • 15 May 2020 • Xiaoxiao Li, Xiaopeng Guo, Liye Mei, Mingyu Shang, Jie Gao, Maojing Shu, Xiang Wang
The core of the VP model is to decompose the light source into light intensity and light spatial distribution to describe the perception process of HVS, offering a refined estimation of illumination and reflectance.
1 code implementation • 21 Mar 2020 • Kun Xiao, Shaochang Tan, Guohui Wang, Xueyan An, Xiang Wang, Xiangke Wang
A customizable multi-rotor UAV simulation platform based on ROS, Gazebo and PX4 is presented.
Robotics
1 code implementation • 12 Mar 2020 • Xiang Wang, Yaokun Xu, Xiangnan He, Yixin Cao, Meng Wang, Tat-Seng Chua
Properly handling missing data is a fundamental challenge in recommendation.