Search Results for author: Wei zhang

Found 535 papers, 161 papers with code

HARD-Net: Hardness-AwaRe Discrimination Network for 3D Early Activity Prediction

no code implementations • ECCV 2020 • Tianjiao Li, Jun Liu, Wei zhang, Ling-Yu Duan

In this paper, we propose a novel Hardness-AwaRe Discrimination Network (HARD-Net) to specifically investigate the relationships between similar activity pairs that are hard to discriminate.

Ranked #6 on Skeleton Based Action Recognition on UAV-Human

Activity Prediction Skeleton Based Action Recognition

Unsupervised Multi-View CNN for Salient View Selection of 3D Objects and Scenes

1 code implementation • ECCV 2020 • Ran Song, Wei zhang, Yitian Zhao, Yonghuai Liu

We present an unsupervised 3D deep learning framework based on a ubiquitously true proposition, which we name view-object consistency: a 3D object and its projected 2D views always belong to the same object class.

Object

Don’t Miss the Labels: Label-semantic Augmented Meta-Learner for Few-Shot Text Classification

1 code implementation • Findings (ACL) 2021 • Qiaoyang Luo, Lingqiao Liu, YuHao Lin, Wei zhang

Few-Shot Text Classification text-classification

Towards Generalizeable Semantic Product Search by Text Similarity Pre-training on Search Click Logs

no code implementations • ECNLP (ACL) 2022 • Zheng Liu, Wei zhang, Yan Chen, Weiyi Sun, Tianchuan Du, Benjamin Schroeder

Recently, semantic search has been successfully applied to e-commerce product search, and the learned semantic spaces for query and product encoding are expected to generalize well to unseen queries or products.

text similarity

LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising

no code implementations • 30 May 2024 • Yuxing Duan, Shihan Peng, Lin Zhu, Wei zhang, Yi Chang, Sheng Zhong, Luxin Yan

Event cameras have significant advantages in capturing dynamic scene information but are prone to noise interference, particularly in challenging conditions such as low threshold and low illumination.

XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser

no code implementations • 27 May 2024 • Xianfu Cheng, Hang Zhang, Jian Yang, Xiang Li, Weixiao Zhou, Kui Wu, Fei Liu, Wei zhang, Tao Sun, Tongliang Li, Zhoujun Li

In the domain of document AI, semi-structured form parsing plays a crucial role.

Functional Programming Paradigm of Python for Scientific Computation Pipeline Integration

no code implementations • 27 May 2024 • Chen Zhang, Lecheng Jia, Wei zhang, Ning Wen

The advent of modern data processing has led to an increasing tendency towards interdisciplinarity, which frequently involves the importation of different technical approaches.

Mamba4KT: An Efficient and Effective Mamba-based Knowledge Tracing Model

no code implementations • 26 May 2024 • Yang Cao, Wei zhang

Recognizing the significance of prioritizing model efficiency and resource usage in knowledge tracing, we introduce Mamba4KT.

ECLIPSE: Semantic Entropy-LCS for Cross-Lingual Industrial Log Parsing

no code implementations • 22 May 2024 • Wei zhang, Xianfu Cheng, Yi Zhang, Jian Yang, Hongcheng Guo, Zhoujun Li, Xiaolin Yin, Xiangyuan Guan, Xu Shi, Liangfan Zheng, Bo Zhang

These challenges are two-fold: 1) Massive log templates: The performance and efficiency of most existing parsers are significantly reduced when logs grow in quantity and vary in length; 2) Complex and changeable semantics: Traditional template-matching algorithms cannot accurately match the log templates of complicated industrial logs because they cannot utilize cross-language logs with similar semantics.

Language Modelling Large Language Model +2

The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition

no code implementations • 14 May 2024 • Lingdong Kong, Shaoyuan Xie, Hanjiang Hu, Yaru Niu, Wei Tsang Ooi, Benoit R. Cottereau, Lai Xing Ng, Yuexin Ma, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu, Weichao Qiu, Wei zhang, Xu Cao, Hao Lu, Ying-Cong Chen, Caixin Kang, Xinning Zhou, Chengyang Ying, Wentao Shang, Xingxing Wei, Yinpeng Dong, Bo Yang, Shengyin Jiang, Zeliang Ma, Dengyi Ji, Haiwen Li, Xingliang Huang, Yu Tian, Genghua Kou, Fan Jia, Yingfei Liu, Tiancai Wang, Ying Li, Xiaoshuai Hao, Yifan Yang, HUI ZHANG, Mengchuan Wei, Yi Zhou, Haimei Zhao, Jing Zhang, Jinke Li, Xiao He, Xiaoqiang Cheng, Bingyang Zhang, Lirong Zhao, Dianlei Ding, Fangsheng Liu, Yixiang Yan, Hongming Wang, Nanfei Ye, Lun Luo, Yubo Tian, Yiwei Zuo, Zhe Cao, Yi Ren, Yunfan Li, Wenjie Liu, Xun Wu, Yifan Mao, Ming Li, Jian Liu, Jiayang Liu, Zihan Qin, Cunxi Chu, Jialei Xu, Wenbo Zhao, Junjun Jiang, Xianming Liu, Ziyan Wang, Chiwei Li, Shilong Li, Chendong Yuan, Songyue Yang, Wentao Liu, Peng Chen, Bin Zhou, YuBo Wang, Chi Zhang, Jianhang Sun, Hai Chen, Xiao Yang, Lizhong Wang, Dongyi Fu, Yongchun Lin, Huitong Yang, Haoang Li, Yadan Luo, Xianjing Cheng, Yong Xu

In the realm of autonomous driving, robust perception under out-of-distribution conditions is paramount for the safe deployment of vehicles.

Autonomous Driving Data Augmentation +3

Weakly-supervised causal discovery based on fuzzy knowledge and complex data complementarity

no code implementations • 14 May 2024 • Wenrui Li, Wei zhang, Qinghao Zhang, Xuegong Zhang, Xiaowo Wang

Extensive experiments with different datasets demonstrate the superiority of KEEL over several state-of-the-art methods in accuracy, robustness and computational efficiency.

Causal Discovery Computational Efficiency

Relating-Up: Advancing Graph Neural Networks through Inter-Graph Relationships

no code implementations • 7 May 2024 • Qi Zou, Na Yu, Daoliang Zhang, Wei zhang, Rui Gao

This module incorporates a relation-aware encoder and a feedback training strategy.

Graph Representation Learning Relation

LingML: Linguistic-Informed Machine Learning for Enhanced Fake News Detection

no code implementations • 7 May 2024 • Jasraj Singh, Fang Liu, Hong Xu, Bee Chin Ng, Wei zhang

In this paper, we enhance ML-based solutions with linguistics input and we propose LingML, linguistic-informed ML, for fake news detection.

Fake News Detection Misinformation

LVOS: A Benchmark for Large-scale Long-term Video Object Segmentation

1 code implementation • 30 Apr 2024 • Lingyi Hong, Zhongying Liu, Wenchao Chen, Chenzhi Tan, Yuang Feng, Xinyu Zhou, Pinxue Guo, Jinglun Li, Zhaoyu Chen, Shuyong Gao, Wei zhang, Wenqiang Zhang

Video object segmentation (VOS) aims to distinguish and track target objects in a video.

Attribute Semantic Segmentation +2

SLAM for Indoor Mapping of Wide Area Construction Environments

no code implementations • 26 Apr 2024 • Vincent Ress, Wei zhang, David Skuddis, Norbert Haala, Uwe Soergel

We apply our state-of-the-art LiDAR and visual SLAM approaches and discuss the respective pros and cons of the different sensor types for trajectory estimation and dense map generation in such an environment.

Pose Estimation Simultaneous Localization and Mapping

NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

no code implementations • 25 Apr 2024 • Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, HaoNing Wu, Yixuan Gao, Yuqin Cao, ZiCheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, Jianquan Yang, Weigang Wang, Xi Fang, Xiaoxin Lv, Jun Yan, Tianwu Zhi, Yabin Zhang, Yaohui Li, Yang Li, Jingwen Xu, Jianzhao Liu, Yiting Liao, Junlin Li, Zihao Yu, Yiting Lu, Xin Li, Hossein Motamednia, S. Farhad Hosseini-Benvidi, Fengbin Guan, Ahmad Mahmoudi-Aznaveh, Azadeh Mansouri, Ganzorig Gankhuyag, Kihwan Yoon, Yifang Xu, Haotian Fan, Fangyuan Kong, Shiling Zhao, Weifeng Dong, Haibing Yin, Li Zhu, Zhiling Wang, Bingchen Huang, Avinab Saha, Sandeep Mishra, Shashank Gupta, Rajesh Sureddi, Oindrila Saha, Luigi Celona, Simone Bianco, Paolo Napoletano, Raimondo Schettini, Junfeng Yang, Jing Fu, Wei zhang, Wenzhi Cao, Limei Liu, Han Peng, Weijun Yuan, Zhan Li, Yihang Cheng, Yifan Deng, Haohui Li, Bowen Qu, Yao Li, Shuqing Luo, Shunzhou Wang, Wei Gao, Zihao Lu, Marcos V. Conde, Xinrui Wang, Zhibo Chen, Ruling Liao, Yan Ye, Qiulin Wang, Bing Li, Zhaokun Zhou, Miao Geng, Rui Chen, Xin Tao, Xiaoyu Liang, Shangkun Sun, Xingyuan Ma, Jiaze Li, Mengduo Yang, Haoran Xu, Jie zhou, Shiding Zhu, Bohan Yu, Pengfei Chen, Xinrui Xu, Jiabin Shen, Zhichao Duan, Erfan Asadi, Jiahe Liu, Qi Yan, Youran Qu, Xiaohui Zeng, Lele Wang, Renjie Liao

A total of 196 participants have registered in the video track.

Image Quality Assessment Image Restoration +2

ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability

no code implementations • 23 Apr 2024 • Xiao Wang, Aristeidis Tsaris, Siyan Liu, Jong-Youl Choi, Ming Fan, Wei zhang, Junqi Yin, Moetasim Ashfaq, Dan Lu, Prasanna Balaprakash

As the largest model of its kind, ORBIT surpasses the current climate AI foundation model size by a thousandfold.

Data Integration

Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans

no code implementations • 18 Apr 2024 • Lixing Tan, Shuang Song, Kangneng Zhou, Chengbo Duan, Lanying Wang, Huayang Ren, Linlin Liu, Wei zhang, Ruoxiu Xiao

Meanwhile, we also impose a supervised process by computing the similarity of computed real DRR and synthesized DRR images.

Disentanglement Image Generation

Data-free Knowledge Distillation for Fine-grained Visual Categorization

1 code implementation • ICCV 2023 • Renrong Shao, Wei zhang, Jianhua Yin, Jun Wang

Our approach utilizes an adversarial distillation framework with attention generator, mixed high-order attention distillation, and semantic feature contrast learning.

Data-free Knowledge Distillation Fine-Grained Visual Categorization +1

GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement

no code implementations • 17 Apr 2024 • Linfang Zheng, Tze Ho Elden Tse, Chen Wang, Yinghan Sun, Hua Chen, Ales Leonardis, Wei zhang

Object pose refinement is essential for robust object pose estimation.

Object Pose Estimation

DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection

no code implementations • 14 Apr 2024 • Lewei Yao, Renjie Pi, Jianhua Han, Xiaodan Liang, Hang Xu, Wei zhang, Zhenguo Li, Dan Xu

This is followed by a fine-tuning stage that leverages a small number of high-resolution samples to further enhance detection performance.

Ranked #2 on Object Detection on ODinW Full-Shot 13 Tasks

Dense Captioning Language Modelling +4

Fast Gradient Computation for Gromov-Wasserstein Distance

no code implementations • 13 Apr 2024 • Wei zhang, ZiHao Wang, Jie Fan, Hao Wu, Yong Zhang

In this way, the original computational bottleneck is broken and the new entropic solution can be obtained with total quadratic time, which is almost optimal complexity.

JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models

no code implementations • 12 Apr 2024 • Yingchaojie Feng, Zhizhang Chen, Zhining Kang, Sijia Wang, Minfeng Zhu, Wei zhang, Wei Chen

Addressing these concerns necessitates a comprehensive analysis of jailbreak prompts to evaluate LLMs' defensive capabilities and identify potential weaknesses.

Map Optical Properties to Subwavelength Structures Directly via a Diffusion Model

no code implementations • 9 Apr 2024 • Shijie Rao, Kaiyu Cui, Yidong Huang, Jiawei Yang, YaLi Li, Shengjin Wang, Xue Feng, Fang Liu, Wei zhang

The inverse design methods proposed for these subwavelength structures are vital to the development of new photonic devices.

Accel-NASBench: Sustainable Benchmarking for Accelerator-Aware NAS

1 code implementation • 9 Apr 2024 • Afzal Ahmad, Linfeng Du, Zhiyao Xie, Wei zhang

We present a technique that allows searching for training proxies that reduce the cost of benchmark construction by significant margins, making it possible to construct realistic NAS benchmarks for large-scale datasets.

Benchmarking Neural Architecture Search

BatSort: Enhanced Battery Classification with Transfer Learning for Battery Sorting and Recycling

1 code implementation • 8 Apr 2024 • Yunyi Zhao, Wei zhang, Erhai Hu, Qingyu Yan, Cheng Xiang, King Jet Tseng, Dusit Niyato

Battery recycling is a critical process for minimizing environmental harm and resource waste for used batteries.

Transfer Learning

A diffusion MRI tractography atlas for concurrent white matter mapping across Eastern and Western populations

no code implementations • 6 Apr 2024 • Yijie Li, Wei zhang, Ye Wu, Li Yin, Ce Zhu, Yuqian Chen, Suheyla Cetin-Karayumak, Kang Ik K Cho, Leo R. Zekelman, Jarrett Rushmore, Yogesh Rathi, Nikos Makris, Lauren J. O'Donnell, Fan Zhang

However, a comprehensive investigation into WM fiber tracts between Eastern and Western populations is challenged due to the lack of a cross-population WM atlas and the large site-specific variability of dMRI data.

Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection

no code implementations • 26 Mar 2024 • Jiacheng Zhang, Jiaming Li, Xiangru Lin, Wei zhang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

Additionally, we present a DepthGradient Projection (DGP) module to mitigate optimization conflicts caused by noisy depth supervision of pseudo-labels, effectively decoupling the depth gradient and removing conflicting gradients.

Monocular 3D Object Detection object-detection +1

SFOD: Spiking Fusion Object Detector

1 code implementation • 22 Mar 2024 • Yimeng Fan, Wei zhang, Changsong Liu, Mingyang Li, Wenrui Lu

Thereby, we establish state-of-the-art classification results based on SNNs, achieving 93.7% accuracy on the NCAR dataset.

Object object-detection +1

Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection

1 code implementation • ICCV 2023 • Jiaming Li, Xiangru Lin, Wei zhang, Xiao Tan, YingYing Li, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

To tackle the confirmation bias from incorrect pseudo labels of minority classes, the class-rebalancing sampling module resamples unlabeled data following the guidance of the gradient-based reweighting module.

object-detection Object Detection +1

LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model

no code implementations • 18 Mar 2024 • Runhui Huang, Kaixin Cai, Jianhua Han, Xiaodan Liang, Renjing Pei, Guansong Lu, Songcen Xu, Wei zhang, Hang Xu

Specifically, an inter-layer attention module is designed to encourage information exchange and learning between layers, while a text-guided intra-layer attention module incorporates layer-specific prompts to direct the specific-content generation for each layer.

Image Generation Style Transfer

OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation

no code implementations • 18 Mar 2024 • Haochen Jiang, Yueming Xu, Yihan Zeng, Hang Xu, Wei zhang, Jianfeng Feng, Li Zhang

We model the geometric structure of the scene with occupancy representation and distill the pre-trained open vocabulary model into a 3D language field via volume rendering for zero-shot inference.

3D Reconstruction 3D Scene Reconstruction +3

TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models

no code implementations • 17 Mar 2024 • Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Longtao Huang, Hui Xue, Wei zhang

KEPLMs are pre-trained models that utilize external knowledge to enhance language understanding.

Knowledge Graphs Knowledge Probing

Affective Behaviour Analysis via Integrating Multi-Modal Knowledge

no code implementations • 16 Mar 2024 • Wei zhang, Feng Qiu, Chen Liu, Lincheng Li, Heming Du, Tiancheng Guo, Xin Yu

Affective Behavior Analysis aims to make technology emotionally smart, creating a world where devices can understand and react to our emotions as humans do.

Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors

no code implementations • 14 Mar 2024 • Guanghua Li, Wensheng Lu, Wei zhang, Defu Lian, Kezhong Lu, Rui Mao, Kai Shu, Hao Liao

The proliferation of fake news has had far-reaching implications on politics, the economy, and society at large.

Claim Verification Fake News Detection +1

OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework

no code implementations • 13 Mar 2024 • Wanyun Li, Pinxue Guo, Xinyu Zhou, Lingyi Hong, Yangji He, Xiangyu Zheng, Wei zhang, Wenqiang Zhang

Contemporary Video Object Segmentation (VOS) approaches typically consist of stages of feature extraction, matching, memory management, and multiple-object aggregation.

Management Semantic Segmentation +2

Query-guided Prototype Evolution Network for Few-Shot Segmentation

no code implementations • 11 Mar 2024 • Runmin Cong, Hang Xiong, Jinpeng Chen, Wei zhang, Qingming Huang, Yao Zhao

To address this, we present the Query-guided Prototype Evolution Network (QPENet), a new method that integrates query features into the generation process of foreground and background prototypes, thereby yielding customized prototypes attuned to specific queries.

Segmentation

ClickVOS: Click Video Object Segmentation

no code implementations • 10 Mar 2024 • Pinxue Guo, Lingyi Hong, Xinyu Zhou, Shuyong Gao, Wanyun Li, Jinglun Li, Zhaoyu Chen, Xiaoqiang Li, Wei zhang, Wenqiang Zhang

To address these limitations, we propose the setting named Click Video Object Segmentation (ClickVOS) which segments objects of interest across the whole video according to a single click per object in the first frame.

Object Segmentation +3

Aligning Large Language Models for Controllable Recommendations

no code implementations • 8 Mar 2024 • Wensheng Lu, Jianxun Lian, Wei zhang, Guanghua Li, Mingyang Zhou, Hao Liao, Xing Xie

Inspired by the exceptional general intelligence of Large Language Models (LLMs), researchers have begun to explore their application in pioneering the next generation of recommender systems - systems that are conversational, explainable, and controllable.

Recommendation Systems

Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery

no code implementations • 6 Mar 2024 • Wei zhang, Miaoxin Cai, Tong Zhang, Guoqiang Lei, Yin Zhuang, Xuerui Mao

Ship detection needs to identify ship locations from remote sensing (RS) scenes.

Language Modelling

SMAUG: A Sliding Multidimensional Task Window-Based MARL Framework for Adaptive Real-Time Subtask Recognition

no code implementations • 4 Mar 2024 • Wenjing Zhang, Wei zhang

Instead of making behavioral decisions directly from the exponentially expanding joint observational-action space, subtask-based multi-agent reinforcement learning (MARL) methods enable agents to learn how to tackle different subtasks.

Hierarchical Reinforcement Learning Multi-agent Reinforcement Learning +4

Explainable Session-based Recommendation via Path Reasoning

no code implementations • 28 Feb 2024 • Yang Cao, Shuo Shang, Jun Wang, Wei zhang

This paper explores providing explainability for session-based recommendation (SR) by path reasoning.

Hierarchical Reinforcement Learning Knowledge Graphs +1

Lemur: Log Parsing with Entropy Sampling and Chain-of-Thought Merging

1 code implementation • 28 Feb 2024 • Wei zhang, Hongcheng Guo, Anjie Le, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Shi Xu, Runqiang Zang, Liangfan Zheng, Bo Zhang

Log parsing, which entails transforming raw log messages into structured templates, constitutes a critical phase in the automation of log analytics.

Log Parsing

StaPep: an open-source tool for the structure prediction and feature extraction of hydrocarbon-stapled peptides

1 code implementation • 28 Feb 2024 • Zhe Wang, Jianping Wu, Mengjun Zheng, Chenchen Geng, Borui Zhen, Wei zhang, Hui Wu, Zhengyang Xu, Gang Xu, Si Chen, Xiang Li

Many tools exist for extracting structural and physicochemical descriptors from linear peptides to predict their properties, but similar tools for hydrocarbon-stapled peptides are lacking. Here, we present StaPep, a Python-based toolkit designed for generating 2D/3D structures and calculating 21 distinct features for hydrocarbon-stapled peptides. The current version supports hydrocarbon-stapled peptides containing 2 non-standard amino acids (norleucine and 2-aminoisobutyric acid) and 6 nonnatural anchoring residues (S3, S5, S8, R3, R5 and R8). We then established a hand-curated dataset of 201 hydrocarbon-stapled peptides and 384 linear peptides with sequence information and experimental membrane permeability, to showcase StaPep's application in artificial intelligence projects. A machine learning-based predictor utilizing the features calculated above was developed, with an AUC of 0.85, for identifying cell-penetrating hydrocarbon-stapled peptides. StaPep's pipeline spans data retrieval, cleaning, structure generation, molecular feature calculation, and machine learning model construction for hydrocarbon-stapled peptides. The source code and dataset are freely available on GitHub: https://github.com/dahuilangda/stapep_package.

Retrieval

Don't Forget Your Reward Values: Language Model Alignment via Value-based Calibration

1 code implementation • 25 Feb 2024 • Xin Mao, Feng-Lin Li, Huimin Xu, Wei zhang, Anh Tuan Luu

While Reinforcement Learning from Human Feedback (RLHF) significantly enhances the generation quality of Large Language Models (LLMs), recent studies have raised concerns regarding the complexity and instability associated with the Proximal Policy Optimization (PPO) algorithm, proposing a series of order-based calibration methods as viable alternatives.

Language Modelling

Reading Relevant Feature from Global Representation Memory for Visual Object Tracking

no code implementations • NeurIPS 2023 • Xinyu Zhou, Pinxue Guo, Lingyi Hong, Jinglun Li, Wei zhang, Weifeng Ge, Wenqiang Zhang

Therefore, using all features in the template and memory can lead to redundancy and impair tracking performance.

Visual Object Tracking

Do Large Language Models Understand Logic or Just Mimick Context?

no code implementations • 19 Feb 2024 • Junbing Yan, Chengyu Wang, Jun Huang, Wei zhang

Over the past few years, the abilities of large language models (LLMs) have received extensive attention; these models have performed exceptionally well in complicated scenarios such as logical reasoning and symbolic inference.

counterfactual In-Context Learning +1

Gaussian Process Neural Additive Models

1 code implementation • 19 Feb 2024 • Wei zhang, Brian Barr, John Paisley

Deep neural networks have revolutionized many fields, but their black-box nature also occasionally prevents their wider adoption in fields such as healthcare and finance, where interpretable and explainable models are required.

Additive models Explainable Models

Pattern-wise Transparent Sequential Recommendation

no code implementations • 18 Feb 2024 • Kun Ma, Cong Xu, Zeyuan Chen, Wei zhang

However, achieving both model transparency and recommendation performance simultaneously is challenging, especially for models that take the entire sequence of items as input without screening.

Decision Making Sequential Recommendation

Random Projection Layers for Multidimensional Time Series Forecasting

no code implementations • 16 Feb 2024 • Chin-Chia Michael Yeh, Yujie Fan, Xin Dai, Vivian Lai, Prince Osei Aboagye, Junpeng Wang, Huiyuan Chen, Yan Zheng, Zhongfang Zhuang, Liang Wang, Wei zhang

All-Multi-Layer Perceptron (all-MLP) mixer models have been shown to be effective for time series forecasting problems.

Time Series Time Series Forecasting

Translating Images to Road Network: A Non-Autoregressive Sequence-to-Sequence Approach

2 code implementations • 13 Feb 2024 • Jiachen Lu, Renyuan Peng, Xinyue Cai, Hang Xu, Hongyang Li, Feng Wen, Wei zhang, Li Zhang

Instead, our work establishes a unified representation of both types of data domain by projecting both Euclidean and non-Euclidean data into an integer series called RoadNet Sequence.

Understanding the Role of Cross-Entropy Loss in Fairly Evaluating Large Language Model-based Recommendation

no code implementations • 9 Feb 2024 • Cong Xu, Zhangchi Zhu, Jun Wang, Jianyong Wang, Wei zhang

Large language models (LLMs) have gained much attention in the recommendation community; some studies have observed that LLMs, fine-tuned by the cross-entropy loss with a full softmax, could achieve state-of-the-art performance already.

Language Modelling Large Language Model

LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement

2 code implementations • 31 Jan 2024 • Renyuan Peng, Xinyue Cai, Hang Xu, Jiachen Lu, Feng Wen, Wei zhang, Li Zhang

Accurate extraction of lane graphs relies on precisely estimating vertex and edge information within the DAG.

Autonomous Driving Language Modelling

EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain

1 code implementation • 30 Jan 2024 • Wei zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao

Multi-modal large language models (MLLMs) have demonstrated remarkable success in vision and visual-language tasks within the natural image domain.

Image Comprehension Instruction Following +2

Fortifying Ethical Boundaries in AI: Advanced Strategies for Enhancing Security in Large Language Models

no code implementations • 27 Jan 2024 • Yunhong He, Jianling Qiu, Wei zhang, Zhengqing Yuan

Recent advancements in large language models (LLMs) have significantly enhanced capabilities in natural language processing and artificial intelligence.

Question Answering Text Generation

Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities

1 code implementation • 16 Jan 2024 • Xu Yan, Haiming Zhang, Yingjie Cai, Jingming Guo, Weichao Qiu, Bin Gao, Kaiqiang Zhou, Yue Zhao, Huan Jin, Jiantao Gao, Zhen Li, Lihui Jiang, Wei zhang, Hongbo Zhang, Dengxin Dai, Bingbing Liu

The rise of large foundation models, trained on extensive datasets, is revolutionizing the field of AI.

Autonomous Driving

PUPAE: Intuitive and Actionable Explanations for Time Series Anomalies

no code implementations • 16 Jan 2024 • Audrey Der, Chin-Chia Michael Yeh, Yan Zheng, Junpeng Wang, Zhongfang Zhuang, Liang Wang, Wei zhang, Eamonn J. Keogh

In this work we introduce a domain agnostic counterfactual explanation technique to produce explanations for time series anomalies.

Anomaly Detection counterfactual +3

Contrastive Learning with Negative Sampling Correction

no code implementations • 13 Jan 2024 • Lu Wang, Chao Du, Pu Zhao, Chuan Luo, Zhangchi Zhu, Bo Qiao, Wei zhang, QIngwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

To correct the negative sampling bias, we propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL).

Contrastive Learning Data Augmentation +2

Fine-Grained Embedding Dimension Optimization During Training for Recommender Systems

no code implementations • 9 Jan 2024 • Qinyi Luo, Penghan Wang, Wei zhang, Fan Lai, Jiachen Mao, Xiaohan Wei, Jun Song, Wei-Yu Tsai, Shuai Yang, Yuxi Hu, Xuehai Qian

Huge embedding tables in modern Deep Learning Recommender Models (DLRM) require prohibitively large memory during training and inference.

Click-Through Rate Prediction Recommendation Systems

Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition

no code implementations • 9 Jan 2024 • Xuzheng Yu, Chen Jiang, Wei zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu

With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important.

Representation Learning Scene Recognition

Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach

no code implementations • 2 Jan 2024 • Prince Aboagye, Yan Zheng, Junpeng Wang, Uday Singh Saini, Xin Dai, Michael Yeh, Yujie Fan, Zhongfang Zhuang, Shubham Jain, Liang Wang, Wei zhang

The emergence of pre-trained models has significantly impacted fields ranging from Natural Language Processing (NLP) and Computer Vision to relational datasets.

Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models

1 code implementation • 2 Jan 2024 • Xinpeng Ding, Jinahua Han, Hang Xu, Xiaodan Liang, Wei zhang, Xiaomeng Li

BEV-InMLLM integrates multi-view, spatial awareness, and temporal semantics to enhance MLLMs' capabilities on NuInstruct tasks.

Autonomous Driving

Towards a Simultaneous and Granular Identity-Expression Control in Personalized Face Generation

no code implementations • 2 Jan 2024 • Renshuai Liu, Bowen Ma, Wei zhang, Zhipeng Hu, Changjie Fan, Tangjie Lv, Yu Ding, Xuan Cheng

We devise a novel diffusion model that can undertake the task of simultaneous face swapping and reenactment.

Face Generation Face Reenactment +1

Enhancing dysarthria speech feature representation with empirical mode decomposition and Walsh-Hadamard transform

no code implementations • 30 Dec 2023 • Ting Zhu, Shufei Duan, Camille Dingam, HuiZhi Liang, Wei zhang

This algorithm effectively addresses the challenges of the imbalanced dataset and non-linearity in dysarthric speech and simultaneously provides a robust representation of the local pathological features of the vocal folds and tracts.

imbalanced classification

Symbolic Cognitive Diagnosis via Hybrid Optimization for Intelligent Education Systems

1 code implementation • 30 Dec 2023 • Junhao Shen, Hong Qian, Wei zhang, Aimin Zhou

The SCD framework incorporates the symbolic tree to explicably represent the complicated student-exercise interaction function, and utilizes gradient-based optimization methods to effectively learn the student and exercise parameters.

Attribute cognitive diagnosis

PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion

no code implementations • 27 Dec 2023 • Guansong Lu, Yuanfan Guo, Jianhua Han, Minzhe Niu, Yihan Zeng, Songcen Xu, Zeyi Huang, Zhao Zhong, Wei zhang, Hang Xu

Current large-scale diffusion models represent a giant leap forward in conditional image synthesis, capable of interpreting diverse cues like text, human poses, and edges.

Computational Efficiency Denoising +1

SERF: Fine-Grained Interactive 3D Segmentation and Editing with Radiance Fields

no code implementations • 26 Dec 2023 • Kaichen Zhou, Lanqing Hong, Enze Xie, Yongxin Yang, Zhenguo Li, Wei zhang

Although significant progress has been made in the field of 2D-based interactive editing, fine-grained 3D-based interactive editing remains relatively unexplored.

Interactive Segmentation Segmentation

Automatic bony structure segmentation and curvature estimation on ultrasound cervical spine images -- a feasibility study

no code implementations • 19 Dec 2023 • Songhan Ge, Haoyuan Tian, Wei zhang, Rui Zheng

In this study, a portable ultrasound imaging system was applied to acquire cervical spine image volume.

Paper
Add Code

DMT: Comprehensive Distillation with Multiple Self-supervised Teachers

no code implementations • 19 Dec 2023 • Yuang Liu, Jing Wang, Qiang Zhou, Fan Wang, Jun Wang, Wei zhang

Numerous self-supervised learning paradigms, such as contrastive learning and masked image modeling, have been proposed to acquire powerful and general representations from unlabeled data.

Contrastive Learning Model Compression +1

Paper
Add Code

Design, construction and evaluation of emotional multimodal pathological speech database

no code implementations • 14 Dec 2023 • Ting Zhu, Shufei Duan, HuiZhi Liang, Wei zhang

Automatic recognition was tested on speech and glottal data, with average accuracy of 78% for controls and 60% for patients on audio, and 51% for controls and 38% for patients on glottal data, indicating an influence of the disease on emotional expression.

Paper
Add Code

Native Language Identification with Large Language Models

no code implementations • 13 Dec 2023 • Wei zhang, Alexandre Salle

We present the first experiments on Native Language Identification (NLI) using LLMs such as GPT-4.

Language Acquisition Native Language Identification

Paper
Add Code

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

1 code implementation • 5 Dec 2023 • Fengyuan Shi, Jiaxi Gu, Hang Xu, Songcen Xu, Wei zhang, LiMin Wang

Now text-to-image foundation models are widely applied to various downstream image synthesis tasks, such as controllable image generation and image editing, while downstream video synthesis tasks are less explored for several reasons.

Image Generation Model Selection +3

Paper
Code

Interpretable Knowledge Tracing via Response Influence-based Counterfactual Reasoning

1 code implementation • 1 Dec 2023 • Jiajun Cui, Minghe Yu, Bo Jiang, Aimin Zhou, Jianyong Wang, Wei zhang

Knowledge tracing (KT) plays a crucial role in computer-aided education and intelligent tutoring systems, aiming to assess students' knowledge proficiency by predicting their future performance on new questions based on their past response records.

counterfactual Counterfactual Reasoning +1

Paper
Code

SimulFlow: Simultaneously Extracting Feature and Identifying Target for Unsupervised Video Object Segmentation

no code implementations • 30 Nov 2023 • Lingyi Hong, Wei zhang, Shuyong Gao, Hong Lu, Wenqiang Zhang

We evaluate our method on several benchmark datasets and achieve state-of-the-art results.

Object object-detection +6

Paper
Add Code

Towards Better Parameter-Efficient Fine-Tuning for Large Language Models: A Position Paper

no code implementations • 22 Nov 2023 • Chengyu Wang, Junbing Yan, Wei zhang, Jun Huang

This paper delves into the pressing need in Parameter-Efficient Fine-Tuning (PEFT) for Large Language Models (LLMs).

Model Compression Position

Paper
Add Code

Soft Random Sampling: A Theoretical and Empirical Analysis

no code implementations • 21 Nov 2023 • Xiaodong Cui, Ashish Mittal, Songtao Lu, Wei zhang, George Saon, Brian Kingsbury

Soft random sampling (SRS) is a simple yet effective approach for efficient training of large-scale deep neural networks when dealing with massive data.

Automatic Speech Recognition speech-recognition +1

Paper
Add Code

Multi-Resolution Planar Region Extraction for Uneven Terrains

no code implementations • 21 Nov 2023 • Yinghan Sun, Linfang Zheng, Hua Chen, Wei zhang

This paper studies the problem of extracting planar regions in uneven terrains from unordered point cloud measurements.

Computational Efficiency

Paper
Add Code

MS-Former: Memory-Supported Transformer for Weakly Supervised Change Detection with Patch-Level Annotations

1 code implementation • 16 Nov 2023 • Zhenglai Li, Chang Tang, Xinwang Liu, Changdong Li, Xianju Li, Wei zhang

How to capture the semantic variations associated with the changed and unchanged regions from the patch-level annotations to obtain promising change results is the critical challenge for the weakly supervised change detection task.

Change Detection

Paper
Code

Scaling User Modeling: Large-scale Online User Representations for Ads Personalization in Meta

no code implementations • 16 Nov 2023 • Wei zhang, Dai Li, Chen Liang, Fang Zhou, Zhongke Zhang, Xuewei Wang, Ru Li, Yi Zhou, Yaning Huang, Dong Liang, Kai Wang, Zhangyuan Wang, Zhengxing Chen, Fenggang Wu, Minghai Chen, Huayu Li, Yunnan Wu, Zhan Shu, Mindi Yuan, Sri Reddy

To address these challenges, we present Scaling User Modeling (SUM), a framework widely deployed in Meta's ads ranking system, designed to facilitate efficient and scalable sharing of online user representation across hundreds of ads models.

Representation Learning

Paper
Add Code

From Complex to Simple: Unraveling the Cognitive Tree for Reasoning with Small Language Models

no code implementations • 12 Nov 2023 • Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Wei zhang

Reasoning is a distinctive human capacity, enabling us to address complex problems by breaking them down into a series of manageable cognitive steps.

Language Modelling Logical Reasoning

Paper
Add Code

VoxNeRF: Bridging Voxel Representation and Neural Radiance Fields for Enhanced Indoor View Synthesis

no code implementations • 9 Nov 2023 • Sen Wang, Wei zhang, Stefano Gasperini, Shun-Cheng Wu, Nassir Navab

Creating high-quality view synthesis is essential for immersive applications but continues to be problematic, particularly in indoor environments and for real-time deployment.

Paper
Add Code

From Input to Output: A Multi-layer Knowledge Distillation Framework for Compressing Recommendation Models

no code implementations • 8 Nov 2023 • Zhangchi Zhu, Wei zhang

In this paper, we decompose recommendation models into three layers, i.e., the input layer, the intermediate layer, and the output layer, and address deficiencies layer by layer.

Knowledge Distillation

Paper
Add Code

Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation

1 code implementation • 7 Nov 2023 • Ruomeng Ding, Chaoyun Zhang, Lu Wang, Yong Xu, Minghua Ma, Wei zhang, Si Qin, Saravan Rajmohan, QIngwei Lin, Dongmei Zhang

To address these limitations, we introduce a novel thought prompting approach called "Everything of Thoughts" (XoT) to defy the law of the "Penrose triangle" of existing thought paradigms.

Decision Making

Paper
Code

Temporal Treasure Hunt: Content-based Time Series Retrieval System for Discovering Insights

no code implementations • 5 Nov 2023 • Chin-Chia Michael Yeh, Huiyuan Chen, Xin Dai, Yan Zheng, Yujie Fan, Vivian Lai, Junpeng Wang, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei zhang

To facilitate this investigation, we introduce a CTSR benchmark dataset that comprises time series data from a variety of domains, such as motion, power demand, and traffic.

Retrieval Time Series +1

Paper
Add Code

Sketching Multidimensional Time Series for Fast Discord Mining

no code implementations • 5 Nov 2023 • Chin-Chia Michael Yeh, Yan Zheng, Menghai Pan, Huiyuan Chen, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei zhang, Jeff M. Phillips, Eamonn Keogh

In this work, we propose a sketch for discord mining among multi-dimensional time series.

Anomaly Detection Time Series +1

Paper
Add Code

Ego-Network Transformer for Subsequence Classification in Time Series Data

no code implementations • 5 Nov 2023 • Chin-Chia Michael Yeh, Huiyuan Chen, Yujie Fan, Xin Dai, Yan Zheng, Vivian Lai, Junpeng Wang, Zhongfang Zhuang, Liang Wang, Wei zhang, Eamonn Keogh

The ego-networks of all subsequences collectively form a time series subsequence graph, and we introduce an algorithm to efficiently construct this graph.

Time Series Time Series Classification

Paper
Add Code

Time Series Synthesis Using the Matrix Profile for Anonymization

no code implementations • 5 Nov 2023 • Audrey Der, Chin-Chia Michael Yeh, Yan Zheng, Junpeng Wang, Huiyuan Chen, Zhongfang Zhuang, Liang Wang, Wei zhang, Eamonn Keogh

As a result, unmodified data mining tools can obtain near-identical performance on the synthesized time series as on the original time series.

Time Series

Paper
Add Code

Visual Analytics for Efficient Image Exploration and User-Guided Image Captioning

no code implementations • 2 Nov 2023 • Yiran Li, Junpeng Wang, Prince Aboagye, Michael Yeh, Yan Zheng, Liang Wang, Wei zhang, Kwan-Liu Ma

On the one hand, by visually examining the captions automatically generated from language-image models for an image dataset, we gain deeper insights into the semantic underpinnings of the visual contents, unearthing data biases that may be entrenched within the dataset.

Caption Generation Efficient Exploration +1

Paper
Add Code

Lightweight super resolution network for point cloud geometry compression

1 code implementation • 2 Nov 2023 • Wei zhang, Dingquan Li, Ge Li, Wen Gao

This paper presents an approach for compressing point cloud geometry by leveraging a lightweight super-resolution network.

Decoder Point cloud reconstruction +2

Paper
Code

PET Tracer Conversion among Brain PET via Variable Augmented Invertible Network

no code implementations • 1 Nov 2023 • Bohui Shen, Wei zhang, Xubiao Liu, Pengfei Yu, Shirui Jiang, Xinchong Shi, Xiangsong Zhang, Xiaoyu Zhou, Weirui Zhang, Bingxuan Li, Qiegen Liu

Meanwhile, the invertible network iteratively estimates the resultant DOPA PET data and compares it to the reference DOPA PET data.

Image Registration

Paper
Add Code

BM2CP: Efficient Collaborative Perception with LiDAR-Camera Modalities

1 code implementation • 23 Oct 2023 • Binyu Zhao, Wei zhang, Zhaonian Zou

Collaborative perception enables agents to share complementary perceptual information with nearby agents.

Autonomous Driving

Paper
Code

Learning Interpretable Rules for Scalable Data Representation and Classification

1 code implementation • 22 Oct 2023 • Zhuo Wang, Wei zhang, Ning Liu, Jianyong Wang

Rule-based models, e.g., decision trees, are widely used in scenarios demanding high model interpretability for their transparent inner structures and good model expressivity.

Classification

Paper
Code

FATA-Trans: Field And Time-Aware Transformer for Sequential Tabular Data

1 code implementation • 20 Oct 2023 • Dongyu Zhang, Liang Wang, Xin Dai, Shubham Jain, Junpeng Wang, Yujie Fan, Chin-Chia Michael Yeh, Yan Zheng, Zhongfang Zhuang, Wei zhang

FATA-Trans is field- and time-aware for sequential tabular data.

Language Modelling Masked Language Modeling

Paper
Code

Parallel compressive super-resolution imaging with wide field-of-view based on physics enhanced network

no code implementations • 20 Oct 2023 • Xiao-Peng Jin, An-Dong Xiong, Wei zhang, Xiao-Qing Wang, Fan Liu, Chang-Heng Li, Xu-Ri Yao, Xue-Feng Liu, Qing Zhao

By training the network with the prior OTF of an arbitrary 128x128-pixel region and fine-tuning the network with other OTFs within rest regions of FOV, we realize both mask optimization and super-resolution imaging with up to 1020x1500 wide FOV.

Super-Resolution

Paper
Add Code

Explore the Effect of Data Selection on Poison Efficiency in Backdoor Attacks

no code implementations • 15 Oct 2023 • Ziqiang Li, Pengfei Xia, Hong Sun, Yueqi Zeng, Wei zhang, Bin Li

In this study, we focus on improving the poisoning efficiency of backdoor attacks from the sample selection perspective.

Audio Classification Image Classification +2

Paper
Add Code

BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations

1 code implementation • 11 Oct 2023 • Qizhi Pei, Wei zhang, Jinhua Zhu, Kehan Wu, Kaiyuan Gao, Lijun Wu, Yingce Xia, Rui Yan

Recent advancements in biological research leverage the integration of molecules, proteins, and natural language to enhance drug discovery.

Ranked #2 on Text-based de novo Molecule Generation on ChEBI-20

Molecule Captioning Text-based de novo Molecule Generation

Paper
Code

HI-SLAM: Monocular Real-time Dense Mapping with Hybrid Implicit Fields

no code implementations • 7 Oct 2023 • Wei zhang, Tiecheng Sun, Sen Wang, Qing Cheng, Norbert Haala

For global consistency, we propose an efficient Sim(3)-based pose graph bundle adjustment (PGBA) approach to run online loop closing and mitigate the pose and scale drift.

Simultaneous Localization and Mapping

Paper
Add Code

Toward a Foundation Model for Time Series Data

no code implementations • 5 Oct 2023 • Chin-Chia Michael Yeh, Xin Dai, Huiyuan Chen, Yan Zheng, Yujie Fan, Audrey Der, Vivian Lai, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei zhang

A foundation model is a machine learning model trained on a large and diverse set of data, typically using self-supervised learning-based pre-training techniques, that can be adapted to various downstream tasks.

Self-Supervised Learning Time Series

Paper
Add Code

An Efficient Content-based Time Series Retrieval System

no code implementations • 5 Oct 2023 • Chin-Chia Michael Yeh, Huiyuan Chen, Xin Dai, Yan Zheng, Junpeng Wang, Vivian Lai, Yujie Fan, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei zhang, Jeff M. Phillips

A Content-based Time Series Retrieval (CTSR) system is an information retrieval system that lets users interact with time series from multiple domains, such as finance, healthcare, and manufacturing.

Information Retrieval Retrieval +1

Paper
Add Code

Multitask Learning for Time Series Data with 2D Convolution

no code implementations • 5 Oct 2023 • Chin-Chia Michael Yeh, Xin Dai, Yan Zheng, Junpeng Wang, Huiyuan Chen, Yujie Fan, Audrey Der, Zhongfang Zhuang, Liang Wang, Wei zhang

In this paper, we investigate the application of MTL to the time series classification (TSC) problem.

Dynamic Time Warping Recommendation Systems +2

Paper
Add Code

Feature Interaction Aware Automated Data Representation Transformation

1 code implementation • 29 Sep 2023 • Ehtesamul Azim, Dongjie Wang, Kunpeng Liu, Wei zhang, Yanjie Fu

Creating an effective representation space is crucial for mitigating the curse of dimensionality, enhancing model generalization, addressing data sparsity, and leveraging classical models more effectively.

Automated Feature Engineering Decision Making +4

Paper
Code

Revealing the Power of Spatial-Temporal Masked Autoencoders in Multivariate Time Series Forecasting

no code implementations • 26 Sep 2023 • Jiarui Sun, Yujie Fan, Chin-Chia Michael Yeh, Wei zhang, Girish Chowdhary

To address these issues, we propose Spatial-Temporal Masked Autoencoders (STMAE), an MTS forecasting framework that leverages masked autoencoders to enhance the performance of spatial-temporal baseline models.

Decoder Multivariate Time Series Forecasting +1

Paper
Add Code

Graph-enhanced Optimizers for Structure-aware Recommendation Embedding Evolution

no code implementations • 24 Sep 2023 • Cong Xu, Jun Wang, Jianyong Wang, Wei zhang

Embeddings play a critical role in modern recommender systems because they are virtual representations of real-world entities and the foundation for subsequent decision models.

Recommendation Systems

Paper
Add Code

PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation

1 code implementation • 21 Sep 2023 • Shilin Yan, Xiaohao Xu, Renrui Zhang, Lingyi Hong, Wenchao Chen, Wenqiang Zhang, Wei zhang

Our dataset poses new challenges in panoramic VOS and we hope that our PanoVOS can advance the development of panoramic segmentation/tracking.

Autonomous Driving Segmentation +4

Paper
Code

Multi-view Fuzzy Representation Learning with Rules based Model

2 code implementations • 20 Sep 2023 • Wei zhang, Zhaohong Deng, Te Zhang, Kup-Sze Choi, Shitong Wang

Second, a new regularization method based on L_(2, 1)-norm regression is proposed to mine the consistency information between views, while the geometric structure of the data is preserved through the Laplacian graph.

Representation Learning

Paper
Code

Learning Segment Similarity and Alignment in Large-Scale Content Based Video Retrieval

no code implementations • 20 Sep 2023 • Chen Jiang, Kaiming Huang, Sifeng He, Xudong Yang, Wei zhang, Xiaobo Zhang, Yuan Cheng, Lei Yang, Qing Wang, Furong Xu, Tan Pan, Wei Chu

SSAN is based on two newly proposed modules in video retrieval: (1) An efficient Self-supervised Keyframe Extraction (SKE) module to reduce redundant frame features, (2) A robust Similarity Pattern Detection (SPD) module for temporal alignment.

Retrieval Video Retrieval

Paper
Add Code

An Empirical Study of Attention Networks for Semantic Segmentation

no code implementations • 19 Sep 2023 • Hao Guo, Hongbiao Si, Guilin Jiang, Wei zhang, Zhiyan Liu, Xuanyi Zhu, xulong Zhang, Yang Liu

What's more, various methods utilize attention in semantic segmentation, but the conclusion of these methods is lacking.

Segmentation Semantic Segmentation

Paper
Add Code

SoccerNet 2023 Challenges Results

2 code implementations • 12 Sep 2023 • Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng

More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org.

Action Spotting Camera Calibration +3

Paper
Code

HiLM-D: Towards High-Resolution Understanding in Multimodal Large Language Models for Autonomous Driving

no code implementations • 11 Sep 2023 • Xinpeng Ding, Jianhua Han, Hang Xu, Wei zhang, Xiaomeng Li

For the first time, we leverage singular multimodal large language models (MLLMs) to consolidate multiple autonomous driving tasks from videos, i.e., the Risk Object Localization and Intention and Suggestion Prediction (ROLISP) task.

Autonomous Driving Object Localization

Paper
Add Code

Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation

no code implementations • 7 Sep 2023 • Jiaxi Gu, Shicong Wang, Haoyu Zhao, Tianyi Lu, Xing Zhang, Zuxuan Wu, Songcen Xu, Wei zhang, Yu-Gang Jiang, Hang Xu

Conditioned on an initial video clip with a small number of frames, additional frames are iteratively generated by reusing the original latent features and following the previous diffusion process.

Action Recognition Decoder +4

Paper
Add Code

Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images

no code implementations • ICCV 2023 • Cuican Yu, Guansong Lu, Yihan Zeng, Jian Sun, Xiaodan Liang, Huibin Li, Zongben Xu, Songcen Xu, Wei zhang, Hang Xu

In this paper, we propose a text-guided 3D face generation method, referred to as TG-3DFace, for generating realistic 3D faces using text guidance.

3D Shape Generation Contrastive Learning +2

Paper
Add Code

Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models

1 code implementation • 27 Aug 2023 • Kaiyuan Gao, Sunan He, Zhenyu He, Jiacheng Lin, Qizhi Pei, Jie Shao, Wei zhang

Generative pre-trained transformer (GPT) models have revolutionized the field of natural language processing (NLP) with remarkable performance in various tasks and also extend their power to multimodal domains.

Paper
Code

Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks

1 code implementation • 22 Aug 2023 • Xi Xie, Hongwu Peng, Amit Hasan, Shaoyi Huang, Jiahui Zhao, Haowen Fang, Wei zhang, Tong Geng, Omer Khan, Caiwen Ding

Utilizing these principles, we formulated a kernel for sparse matrix multiplication (SpMM) in GCNs that employs block-level partitioning and combined warp strategy.

Computational Efficiency

Paper
Code

GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training

no code implementations • ICCV 2023 • Xinchi Deng, Han Shi, Runhui Huang, Changlin Li, Hang Xu, Jianhua Han, James Kwok, Shen Zhao, Wei zhang, Xiaodan Liang

Compared with the existing methods, GrowCLIP improves 2.3% average top-1 accuracy on zero-shot image classification of 9 downstream tasks.

Image Classification Image Retrieval +2

Paper
Add Code

DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability

no code implementations • ICCV 2023 • Runhui Huang, Jianhua Han, Guansong Lu, Xiaodan Liang, Yihan Zeng, Wei zhang, Hang Xu

DiffDis first formulates the image-text discriminative problem as a generative diffusion process of the text embedding from the text encoder conditioned on the image.

Image Generation Zero-Shot Learning

Paper
Add Code

Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection

2 code implementations • 17 Aug 2023 • Runmin Cong, Hongyu Liu, Chen Zhang, Wei zhang, Feng Zheng, Ran Song, Sam Kwong

By integrating complementary information from RGB image and depth map, the ability of salient object detection (SOD) for complex and challenging scenes can be improved.

object-detection RGB-D Salient Object Detection +1

Paper
Code

SDDNet: Style-guided Dual-layer Disentanglement Network for Shadow Detection

1 code implementation • 17 Aug 2023 • Runmin Cong, Yuchen Guan, Jinpeng Chen, Wei zhang, Yao Zhao, Sam Kwong

Despite significant progress in shadow detection, current methods still struggle with the adverse impact of background color, which may lead to errors when shadows are present on complex backgrounds.

Disentanglement Shadow Detection

Paper
Code

Frequency Perception Network for Camouflaged Object Detection

2 code implementations • 17 Aug 2023 • Runmin Cong, Mengyao Sun, Sanyi Zhang, Xiaofei Zhou, Wei zhang, Yao Zhao

Camouflaged object detection (COD) aims to accurately detect objects hidden in the surrounding environment.

Object object-detection +1

Paper
Code

MEDOE: A Multi-Expert Decoder and Output Ensemble Framework for Long-tailed Semantic Segmentation

no code implementations • 16 Aug 2023 • Junao Shen, Long Chen, Kun Kuang, Fei Wu, Tian Feng, Wei zhang

The proposed two-stage framework comprises a multi-expert decoder (MED) and a multi-expert output ensemble (MOE).

Decoder Segmentation +1

Paper
Add Code

Beyond Semantics: Learning a Behavior Augmented Relevance Model with Self-supervised Learning

1 code implementation • 10 Aug 2023 • Zeyuan Chen, Wei Chen, Jia Xu, Zhongyi Liu, Wei zhang

Drawing inspiration from this, we devise a novel Behavior Augmented Relevance Learning model for Alipay Search (BARL-ASe) that leverages neighbor queries of target item and neighbor items of target query to complement target query-item semantic matching.

Self-Supervised Learning Semantic Similarity +1

Paper
Code

Gaussian-based Probabilistic Deep Supervision Network for Noise-Resistant QoS Prediction

no code implementations • 3 Aug 2023 • Ziliang Wang, Xiaohong Zhang, Sheng Huang, Wei zhang, Dan Yang, Meng Yan

Quality of Service (QoS) prediction is an essential task in recommendation systems, where accurately predicting unknown QoS values can improve user satisfaction.

Recommendation Systems

Paper
Add Code

Dynamic Token-Pass Transformers for Semantic Segmentation

no code implementations • 3 Aug 2023 • Yuang Liu, Qiang Zhou, Jing Wang, Fan Wang, Jun Wang, Wei zhang

Vision transformers (ViT) usually extract features via forwarding all the tokens in the self-attention layers from top to toe.

Segmentation Semantic Segmentation

Paper
Add Code

EmbeddingTree: Hierarchical Exploration of Entity Features in Embedding

no code implementations • 2 Aug 2023 • Yan Zheng, Junpeng Wang, Chin-Chia Michael Yeh, Yujie Fan, Huiyuan Chen, Liang Wang, Wei zhang

The tool helps users discover nuanced features of data entities, perform feature denoising/injecting in embedding training, and generate embeddings for unseen entities.

Denoising

Paper
Add Code

Knowledge-aware Collaborative Filtering with Pre-trained Language Model for Personalized Review-based Rating Prediction

1 code implementation • 2 Aug 2023 • Quanxiu Wang, Xinlei Cao, Jianyong Wang, Wei zhang

For the first issue, to utilize rich knowledge, KCF-PLM develops a transformer network to model the interactions of the extracted aspects w.r.t.

Collaborative Filtering Language Modelling

Paper
Code

Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction

1 code implementation • 1 Aug 2023 • Zhangchi Zhu, Lu Wang, Pu Zhao, Chao Du, Wei zhang, Hang Dong, Bo Qiao, QIngwei Lin, Saravan Rajmohan, Dongmei Zhang

To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first.

Paper
Code

Computational Approaches for Traditional Chinese Painting: From the "Six Principles of Painting" Perspective

no code implementations • 26 Jul 2023 • Wei zhang, Jian-Wei Zhang, Kam Kwai Wong, Yifang Wang, Yingchaojie Feng, Luwei Wang, Wei Chen

Second, we created a four-stage framework to illustrate the purposes of TCP applications.

Paper
Add Code

Learning and Evaluating Human Preferences for Conversational Head Generation

no code implementations • 20 Jul 2023 • Mohan Zhou, Yalong Bai, Wei zhang, Ting Yao, Tiejun Zhao, Tao Mei

In this paper, we propose a novel learning-based evaluation metric named Preference Score (PS) for fitting human preference according to the quantitative evaluations across different dimensions.

Paper
Add Code

Semi-DETR: Semi-Supervised Object Detection with Detection Transformers

3 code implementations • CVPR 2023 • Jiacheng Zhang, Xiangru Lin, Wei zhang, Kuo Wang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

Specifically, we propose a Stage-wise Hybrid Matching strategy that combines the one-to-many assignment and one-to-one assignment strategies to improve the training efficiency of the first stage and thus provide high-quality pseudo labels for the training of the second stage.

Ranked #1 on Semi-Supervised Object Detection on COCO 5% labeled data

Object object-detection +3

Paper
Code

Visual Analytics For Machine Learning: A Data Perspective Survey

no code implementations • 15 Jul 2023 • Junpeng Wang, Shixia Liu, Wei zhang

The past decade has witnessed a plethora of works that leverage the power of visualization (VIS) to interpret machine learning (ML) models.

Paper
Add Code

Contrastive Graph Pooling for Explainable Classification of Brain Networks

1 code implementation • 7 Jul 2023 • Jiaxing Xu, Qingtian Bian, Xinhang Li, Aihu Zhang, Yiping Ke, Miao Qiao, Wei zhang, Wei Khang Jeremy Sim, Balázs Gulyás

Our contributions underscore the potential of ContrastPool for advancing the understanding of brain networks and neurodegenerative conditions.

Classification

Paper
Code

CityTrack: Improving City-Scale Multi-Camera Multi-Target Tracking by Location-Aware Tracking and Box-Grained Matching

no code implementations • 6 Jul 2023 • Jincheng Lu, Xipeng Yang, Jin Ye, Yifu Zhang, Zhikang Zou, Wei zhang, Xiao Tan

Targets in urban traffic scenes often undergo occlusion, illumination changes, and perspective changes, making it difficult to associate targets across different cameras accurately.

Paper
Add Code

Interactive Conversational Head Generation

no code implementations • 5 Jul 2023 • Mohan Zhou, Yalong Bai, Wei zhang, Ting Yao, Tiejun Zhao

Based on ViCo and ViCo-X, we define three novel tasks targeting the interaction modeling during the face-to-face conversation: 1) responsive listening head generation making listeners respond actively to the speaker with non-verbal signals, 2) expressive talking head generation guiding speakers to be aware of listeners' behaviors, and 3) conversational head generation to integrate the talking/listening ability in one interlocutor.

Sentence Talking Head Generation

Paper
Add Code

Cross-Element Combinatorial Selection for Multi-Element Creative in Display Advertising

no code implementations • 4 Jul 2023 • Wei zhang, Ping Zhang, Jian Dong, Yongkang Wang, Pengye Zhang, Bo Zhang, Xingxing Wang, Dong Wang

The effectiveness of ad creatives is greatly influenced by their visual appearance.

Decoder

Paper
Add Code

SUGAR: Spherical Ultrafast Graph Attention Framework for Cortical Surface Registration

no code implementations • 2 Jul 2023 • Jianxun Ren, Ning An, Youjia Zhang, Danyang Wang, Zhenyu Sun, Cong Lin, Weigang Cui, Weiwei Wang, Ying Zhou, Wei zhang, Qingyu Hu, Ping Zhang, Dan Hu, Danhong Wang, Hesheng Liu

Cortical surface registration plays a crucial role in aligning cortical functional and anatomical features across individuals.

Computational Efficiency Data Augmentation +1

Paper
Add Code

Understanding recent deep-learning techniques for identifying collective variables of molecular dynamics

no code implementations • 1 Jul 2023 • Wei zhang, Christof Schütte

High-dimensional metastable molecular systems can often be characterised by a few features of the system, i.e., collective variables (CVs).

Paper
Add Code

Deep Equilibrium Multimodal Fusion

no code implementations • 29 Jun 2023 • Jinhong Ni, Yalong Bai, Wei zhang, Ting Yao, Tao Mei

Multimodal fusion integrates the complementary information present in multiple modalities and has gained much attention recently.

Visual Question Answering (VQA)

Paper
Add Code

A Theory of Complex Adaptive Learning Behavior in Complex Adaptive Systems and a Non-Localized Wave Equation in Quantum Mechanics

no code implementations • 27 Jun 2023 • Leilei Shi, Xinshuai Guo, Jiuchang Wei, Wei zhang, Guocheng Wang, Bing-Hong Wang

Keywords: complex adaptive systems, complex adaptive learning, universal law, non-localized wave equation, interactively coherent entanglement, interactively coherent adaptation. PACS: 89.75.-k (Complex Systems); 89.65.Gh (Economics, Econophysics, Financial Markets, Business and Management); 03.65.Ud (Entanglement and Quantum Nonlocality)

Paper
Add Code

A Collaborative Transfer Learning Framework for Cross-domain Recommendation

no code implementations • 26 Jun 2023 • Wei zhang, Pengye Zhang, Bo Zhang, Xingxing Wang, Dong Wang

The disadvantage of the former is that the data from other domains is not utilized by a single-domain model, while the latter leverages all the data from different domains, but the fine-tuned model of transfer learning may be trapped in a local optimum of the source domain, making it difficult to fit the target domain.

Click-Through Rate Prediction Recommendation Systems +1

Paper
Add Code

FlowFace++: Explicit Semantic Flow-supervised End-to-End Face Swapping

no code implementations • 22 Jun 2023 • Yu Zhang, Hao Zeng, Bowen Ma, Wei zhang, Zhimeng Zhang, Yu Ding, Tangjie Lv, Changjie Fan

The discriminator is shape-aware and relies on a semantic flow-guided operation to explicitly calculate the shape discrepancies between the target and source faces, thus optimizing the face swapping network to generate highly realistic results.

Decoder Face Swapping

Paper
Add Code

Visual-Aware Text-to-Speech

no code implementations • 21 Jun 2023 • Mohan Zhou, Yalong Bai, Wei zhang, Ting Yao, Tiejun Zhao, Tao Mei

Dynamically synthesizing talking speech that actively responds to a listening head is critical during the face-to-face interaction.

Speech Synthesis

Paper
Add Code

CMLM-CSE: Based on Conditional MLM Contrastive Learning for Sentence Embeddings

no code implementations • 16 Jun 2023 • Wei zhang, Xu Chen

Traditional comparative learning sentence embedding directly uses the encoder to extract sentence features, and then passes in the comparative loss function for learning.

Contrastive Learning Language Modelling +3

Paper
Add Code

ScrollTimes: Tracing the Provenance of Paintings as a Window into History

no code implementations • 15 Jun 2023 • Wei zhang, Wong Kam-Kwai, Yitian Chen, Ailing Jia, Luwei Wang, Jian-Wei Zhang, Lechao Cheng, Huamin Qu, Wei Chen

The study of cultural artifact provenance, tracing ownership and preservation, holds significant importance in archaeology and art history.

Paper
Add Code

PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators

1 code implementation • 15 Jun 2023 • Runmin Cong, Wenyu Yang, Wei zhang, Chongyi Li, Chun-Le Guo, Qingming Huang, Sam Kwong

Among existing UIE methods, Generative Adversarial Networks (GANs) based methods perform well in visual aesthetics, while the physical model-based methods have better scene adaptability.

Quantization UIE

Paper
Code

Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation

1 code implementation • 14 Jun 2023 • Xiao He, Chang Tang, Xinwang Liu, Wei zhang, Kun Sun, Jiangfeng Xu

S2ADet comprises a hyperspectral information decoupling (HID) module, a two-stream feature extraction network, and a one-stage detection head.

Object object-detection +1

Paper
Code

A Proxy Attack-Free Strategy for Practically Improving the Poisoning Efficiency in Backdoor Attacks

no code implementations • 14 Jun 2023 • Ziqiang Li, Hong Sun, Pengfei Xia, Beihao Xia, Xue Rui, Wei zhang, Qinglang Guo, Bin Li

This paper presents a Proxy attack-Free Strategy (PFS) designed to identify efficient poisoning samples based on individual similarity and ensemble diversity, effectively addressing the mentioned concern.

Active Learning Backdoor Attack

Paper
Add Code

Approximate Maximum-Likelihood RIS-Aided Positioning

no code implementations • 13 Jun 2023 • Wei zhang, Zhenni Wang, Wee Peng Tay

In this paper, we develop a RIS-aided positioning framework to locate a UE in environments where the LOS path may or may not be available.

Paper
Add Code

E2E-LOAD: End-to-End Long-form Online Action Detection

1 code implementation • ICCV 2023 • Shuqiang Cao, Weixin Luo, Bairui Wang, Wei zhang, Lin Ma

Furthermore, we propose a novel and efficient inference mechanism that accelerates heavy spatial-temporal exploration.

Online Action Detection

Paper
Code

NFTVis: Visual Analysis of NFT Performance

no code implementations • 5 Jun 2023 • Fan Yan, Xumeng Wang, Ketian Mao, Wei zhang, Wei Chen

A non-fungible token (NFT) is a data unit stored on the blockchain.

Time Series

Paper
Add Code

PDT: Pretrained Dual Transformers for Time-aware Bipartite Graphs

no code implementations • 2 Jun 2023 • Xin Dai, Yujie Fan, Zhongfang Zhuang, Shubham Jain, Chin-Chia Michael Yeh, Junpeng Wang, Liang Wang, Yan Zheng, Prince Osei Aboagye, Wei zhang

Pre-training on large models is prevalent and emerging with the ever-growing user-generated content in many machine learning application categories.

Contrastive Learning

Paper
Add Code

Can LLMs like GPT-4 outperform traditional AI tools in dementia diagnosis? Maybe, but not today

no code implementations • 2 Jun 2023 • Zhuo Wang, Rongzhen Li, Bowen Dong, Jie Wang, Xiuxing Li, Ning Liu, Chenhui Mao, Wei zhang, Liling Dong, Jing Gao, Jianyong Wang

In this paper, we explore the potential of LLMs such as GPT-4 to outperform traditional AI tools in dementia diagnosis.

Paper
Add Code

Hybrid Driven Learning for Channel Estimation in Intelligent Reflecting Surface Aided Millimeter Wave Communications

no code implementations • 30 May 2023 • Shuntian Zheng, Sheng Wu, Chunxiao Jiang, Wei zhang, Xiaojun Jing

Intelligent reflecting surfaces (IRS) have been proposed in millimeter wave (mmWave) and terahertz (THz) systems to achieve both coverage and capacity enhancement, where the design of hybrid precoders, combiners, and the IRS typically relies on channel state information.

Denoising

Paper
Add Code

Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation

1 code implementation • 25 May 2023 • Shilin Yan, Renrui Zhang, Ziyu Guo, Wenchao Chen, Wei zhang, Hongyang Li, Yu Qiao, Hao Dong, Zhongjiang He, Peng Gao

In this paper, we propose MUTR, a Multi-modal Unified Temporal transformer for Referring video object segmentation.

Ranked #1 on Referring Expression Segmentation on Referring Expressions for DAVIS 2016 & 2017

Object Referring Expression Segmentation +3

Paper
Code

MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition

1 code implementation • ICCV 2023 • Tianlun Zheng, Zhineng Chen, Bingchen Huang, Wei zhang, Yu-Gang Jiang

In this paper, we propose the Incremental MLTR (IMLTR) task in the context of incremental learning (IL), where different languages are introduced in batches.

Ranked #1 on Incremental Learning on MLT17

Continual Learning Incremental Learning +2

Paper
Code

MolXPT: Wrapping Molecules with Text for Generative Pre-training

no code implementations • 18 May 2023 • Zequn Liu, Wei zhang, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Ming Zhang, Tie-Yan Liu

Considering that text is the most important record for scientific discovery, in this paper, we propose MolXPT, a unified language model of text and molecules pre-trained on SMILES (a sequence representation of molecules) wrapped by text.

Ranked #1 on Molecular Property Prediction on ClinTox

Language Modelling Molecular Property Prediction +3

Paper
Add Code

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model

3 code implementations • 28 Apr 2023 • Peng Gao, Jiaming Han, Renrui Zhang, Ziyi Lin, Shijie Geng, Aojun Zhou, Wei zhang, Pan Lu, Conghui He, Xiangyu Yue, Hongsheng Li, Yu Qiao

This strategy effectively alleviates the interference between the two tasks of image-text alignment and instruction following and achieves strong multi-modal reasoning with only a small-scale image-text and instruction dataset.

Ranked #6 on Visual Question Answering (VQA) on InfiMM-Eval

Instruction Following Optical Character Recognition (OCR) +7

5,559

Paper
Code

STNet: Spatial and Temporal feature fusion network for change detection in remote sensing images

1 code implementation • 22 Apr 2023 • Xiaowen Ma, Jiawei Yang, Tingfeng Hong, Mengting Ma, Ziyan Zhao, Tian Feng, Wei zhang

As an important task in remote sensing image analysis, remote sensing change detection (RSCD) aims to identify changes of interest in a region from spatially co-registered multi-temporal remote sensing images, so as to monitor the local development.

Binary Classification Change Detection

Paper
Code

SACANet: scene-aware class attention network for semantic segmentation of remote sensing images

1 code implementation • 22 Apr 2023 • Xiaowen Ma, Rui Che, Tingfeng Hong, Mengting Ma, Ziyan Zhao, Tian Feng, Wei zhang

In this paper, we integrate both scene-aware and class attentions to propose a scene-aware class attention network (SACANet) for semantic segmentation of remote sensing images.

Semantic Segmentation

Paper
Code

OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping

1 code implementation • NeurIPS 2023 • Huijie Wang, Tianyu Li, Yang Li, Li Chen, Chonghao Sima, Zhenbo Liu, Bangjun Wang, Peijin Jia, Yuting Wang, Shengyin Jiang, Feng Wen, Hang Xu, Ping Luo, Junchi Yan, Wei zhang, Hongyang Li

Accurately depicting the complex traffic scene is a vital component for autonomous vehicles to execute correct judgments.

3D Lane Detection

500

Paper
Code

Network Pruning Spaces

no code implementations • 19 Apr 2023 • Xuanyu He, Yu-I Yang, Ran Song, Jiachen Pu, Conggang Hu, Feijun Jiang, Wei zhang, Huanghao Ding

Statistically, the structure of a winning subnetwork guarantees an approximately optimal ratio in this regime.

Network Pruning

Paper
Add Code

Frequency Decomposition to Tap the Potential of Single Domain for Generalization

no code implementations • 14 Apr 2023 • Qingyue Yang, Hongjing Niu, Pengfei Xia, Wei zhang, Bin Li

Then, a new method that learns through multiple frequency domains is proposed.

Domain Generalization

Paper
Add Code

DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment

no code implementations • CVPR 2023 • Lewei Yao, Jianhua Han, Xiaodan Liang, Dan Xu, Wei zhang, Zhenguo Li, Hang Xu

This paper presents DetCLIPv2, an efficient and scalable training framework that incorporates large-scale image-text pairs to achieve open-vocabulary object detection (OVD).

Ranked #5 on Object Detection on ODinW Full-Shot 13 Tasks

Language Modelling object-detection +1

Paper
Add Code

RSPT: Reconstruct Surroundings and Predict Trajectories for Generalizable Active Object Tracking

no code implementations • 7 Apr 2023 • Fangwei Zhong, Xiao Bi, Yudi Zhang, Wei zhang, Yizhou Wang

However, building a generalizable active tracker that works robustly across different scenarios remains a challenge, especially in unstructured environments with cluttered obstacles and diverse layouts.

Autonomous Driving Object Tracking

Paper
Add Code

HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation

1 code implementation • CVPR 2023 • Linfang Zheng, Chen Wang, Yinghan Sun, Esha Dasgupta, Hua Chen, Ales Leonardis, Wei zhang, Hyung Jin Chang

In this paper, we focus on the problem of category-level object pose estimation, which is challenging due to the large intra-category shape variation.

Pose Estimation Translation

Paper
Code

ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box

no code implementations • 27 Mar 2023 • Yifu Zhang, Xinggang Wang, Xiaoqing Ye, Wei zhang, Jincheng Lu, Xiao Tan, Errui Ding, Peize Sun, Jingdong Wang

We propose a hierarchical data association strategy to mine the true objects in low-score detection boxes, which alleviates the problems of object missing and fragmented trajectories.

3D Multi-Object Tracking motion prediction +1

Paper
Add Code

Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection

1 code implementation • CVPR 2023 • Chang Liu, Weiming Zhang, Xiangru Lin, Wei zhang, Xiao Tan, Junyu Han, Xiaomao Li, Errui Ding, Jingdong Wang

It employs a "divide-and-conquer" strategy and separately exploits positives for the classification and localization task, which is more robust to the assignment ambiguity.

Ranked #1 on Semi-Supervised Object Detection on COCO 10% labeled data (detector metric)

Dense Object Detection Object +3

12,217

Paper
Code

How Does Attention Work in Vision Transformers? A Visual Analytics Attempt

no code implementations • 24 Mar 2023 • Yiran Li, Junpeng Wang, Xin Dai, Liang Wang, Chin-Chia Michael Yeh, Yan Zheng, Wei zhang, Kwan-Liu Ma

Multi-head self-attentions are then applied to the sequence to learn the attention between patches.

Paper
Add Code

Multi-modal Facial Affective Analysis based on Masked Autoencoder

no code implementations • 20 Mar 2023 • Wei zhang, Bowen Ma, Feng Qiu, Yu Ding

The CVPR 2023 Competition on Affective Behavior Analysis in-the-wild (ABAW) is dedicated to providing high-quality and large-scale Aff-wild2 for the recognition of commonly used emotion representations, such as Action Units (AU), basic expression categories (EXPR), and Valence-Arousal (VA).

Paper
Add Code

SpatialFormer: Semantic and Target Aware Attentions for Few-Shot Learning

1 code implementation • 15 Mar 2023 • Jinxiang Lai, Siqian Yang, Wenlong Wu, Tao Wu, Guannan Jiang, Xi Wang, Jun Liu, Bin-Bin Gao, Wei zhang, Yuan Xie, Chengjie Wang

Then we derive two specific attention modules, named SpatialFormer Semantic Attention (SFSA) and SpatialFormer Target Attention (SFTA), to enhance the target object regions while reducing the background distraction.

Few-Shot Learning

Paper
Code

LoG-CAN: local-global Class-aware Network for semantic segmentation of remote sensing images

1 code implementation • 14 Mar 2023 • Xiaowen Ma, Mengting Ma, Chenlu Hu, Zhiyuan Song, Ziyan Zhao, Tian Feng, Wei zhang

We present LoG-CAN, a multi-scale semantic segmentation network with a global class-aware (GCA) module and local class-aware (LCA) modules for remote sensing images.

Segmentation Semantic Segmentation

Paper
Code

CapDet: Unifying Dense Captioning and Open-World Detection Pretraining

no code implementations • CVPR 2023 • Yanxin Long, Youpeng Wen, Jianhua Han, Hang Xu, Pengzhen Ren, Wei zhang, Shen Zhao, Xiaodan Liang

Besides, our CapDet also achieves state-of-the-art performance on dense captioning tasks, e.g., 15.44% mAP on VG V1.2 and 13.98% on the VG-COCO dataset.

Dense Captioning

Paper
Add Code

Joint Task and Data Oriented Semantic Communications: A Deep Separate Source-channel Coding Scheme

no code implementations • 27 Feb 2023 • Jianhao Huang, Dongxu Li, Chuan Huang, Xiaoqi Qin, Wei zhang

This paper proposes a deep separate source-channel coding (DSSCC) framework for the joint task and data oriented semantic communications (JTD-SC) and utilizes the variational autoencoder approach to solve the rate-distortion problem with semantic distortion.

Bayesian Inference Data Compression

Paper
Add Code

Entity-Level Text-Guided Image Manipulation

1 code implementation • 22 Feb 2023 • Yikai Wang, Jianan Wang, Guansong Lu, Hang Xu, Zhenguo Li, Wei zhang, Yanwei Fu

In the image manipulation phase, SeMani adopts a generative model to synthesize new images conditioned on the entity-irrelevant regions and target text descriptions.

Denoising Image Manipulation

Paper
Code

Simulation-to-reality UAV Fault Diagnosis with Deep Learning

no code implementations • 9 Feb 2023 • Wei zhang, Junjie Tong, Fang Liao, Yunfeng Zhang

Accurate diagnosis of propeller faults is crucial for ensuring the safe and efficient operation of quadrotors.

Domain Adaptation

Paper
Add Code

RIS-Position and Orientation Estimation in MIMO-OFDM Systems with Practical Scatterers

no code implementations • 9 Feb 2023 • Sheng Hong, Minghui Li, Cunhua Pan, Marco Di Renzo, Wei zhang, Lajos Hanzo

A two-step positioning scheme is exploited, where the channel parameters are first acquired, and the position-related parameters are then estimated.

Position

Paper
Add Code

Language-Driven Anchors for Zero-Shot Adversarial Robustness

1 code implementation • 30 Jan 2023 • Xiao Li, Wei zhang, Yining Liu, Zhanhao Hu, Bo Zhang, Xiaolin Hu

Previous research mainly focuses on improving adversarial robustness in the fully supervised setting, leaving the challenging domain of zero-shot adversarial robustness an open question.

Adversarial Defense Adversarial Robustness +3

Paper
Code

Deep-learning-based on-chip rapid spectral imaging with high spatial resolution

no code implementations • 16 Jan 2023 • Jiawei Yang, Kaiyu Cui, Yidong Huang, Wei zhang, Xue Feng, Fang Liu

Spectral imaging extends the concept of traditional color cameras to capture images across multiple spectral channels and has broad application prospects.

Autonomous Driving Metamerism +1

Paper
Add Code

EPR-Net: Constructing non-equilibrium potential landscape via a variational force projection formulation

1 code implementation • 5 Jan 2023 • Yue Zhao, Wei zhang, Tiejun Li

We present EPR-Net, a novel and effective deep learning approach that tackles a crucial challenge in biophysics: constructing potential landscapes for high-dimensional non-equilibrium steady-state (NESS) systems.

Dimensionality Reduction

Paper
Code

Machine Learning for Large-Scale Optimization in 6G Wireless Networks

no code implementations • 3 Jan 2023 • Yandong Shi, Lixiang Lian, Yuanming Shi, Zixin Wang, Yong Zhou, Liqun Fu, Lin Bai, Jun Zhang, Wei zhang

The sixth generation (6G) wireless systems are envisioned to enable the paradigm shift from "connected things" to "connected intelligence", featured by ultra high density, large-scale, dynamic heterogeneity, diversified functional requirements and machine learning capabilities, which leads to a growing need for highly efficient intelligent algorithms.

Computational Efficiency Distributed Optimization +2

Paper
Add Code

CFCG: Semi-Supervised Semantic Segmentation via Cross-Fusion and Contour Guidance Supervision

no code implementations • ICCV 2023 • Shuo Li, Yue He, Weiming Zhang, Wei zhang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang

Current state-of-the-art semi-supervised semantic segmentation (SSSS) methods typically adopt pseudo labeling and consistency regularization between multiple learners with different perturbations.

Semi-Supervised Semantic Segmentation

Paper
Add Code

Translating Images to Road Network: A Non-Autoregressive Sequence-to-Sequence Approach

no code implementations • ICCV 2023 • Jiachen Lu, Renyuan Peng, Xinyue Cai, Hang Xu, Hongyang Li, Feng Wen, Wei zhang, Li Zhang

The extraction of road network is essential for the generation of high-definition maps since it enables the precise localization of road landmarks and their interconnections.

Paper
Add Code

WaterMask: Instance Segmentation for Underwater Imagery

1 code implementation • ICCV 2023 • Shijie Lian, Hua Li, Runmin Cong, Suqi Li, Wei zhang, Sam Kwong

Underwater image instance segmentation is a fundamental and critical step in underwater image analysis and understanding.

Ranked #1 on Instance Segmentation on UIIS

2D Object Detection Graph Attention +3

Paper
Code

A Deep Learning Method for Real-time Bias Correction of Wind Field Forecasts in the Western North Pacific

no code implementations • 29 Dec 2022 • Wei zhang, Yueyue Jiang, Junyu Dong, Xiaojiang Song, Renbo Pang, Boyu Guoan, Hui Yu

In this study, we developed the Multi-Task-Double Encoder Trajectory Gated Recurrent Unit (MT-DETrajGRU) model, which uses an improved double-encoder forecaster architecture to model the spatiotemporal sequence of the U and V components of the wind field; we designed a multi-task learning loss function to correct wind speed and wind direction simultaneously using only one model.

Multi-Task Learning

Paper
Add Code

Circular Accessible Depth: A Robust Traversability Representation for UGV Navigation

no code implementations • 28 Dec 2022 • Shikuan Xie, Ran Song, Yuenan Zhao, Xueqin Huang, Yibin Li, Wei zhang

In this paper, we present the Circular Accessible Depth (CAD), a robust traversability representation for an unmanned ground vehicle (UGV) to learn traversability in various scenarios containing irregular obstacles.

Paper
Add Code

Semantic optical fiber communication system

no code implementations • 27 Dec 2022 • Zhenming Yu, Hongyu Huang, Liming Cheng, Wei zhang, Yueqiu Mu, Kun Xu

The current optical communication systems minimize bit or symbol errors without considering the semantic meaning behind digital bits, thus transmitting a lot of unnecessary information.

Paper
Add Code

Differentiating Student Feedbacks for Knowledge Tracing

no code implementations • 16 Dec 2022 • Jiajun Cui, Wei zhang

In computer-aided education and intelligent tutoring systems, knowledge tracing (KT) raises attention due to the development of data-driven learning methods, which aims to predict students' future performance given their past question response sequences to trace their knowledge states.

Knowledge Tracing

Paper
Add Code

Adaptive Low-Precision Training for Embeddings in Click-Through Rate Prediction

no code implementations • 12 Dec 2022 • Shiwei Li, Huifeng Guo, Lu Hou, Wei zhang, Xing Tang, Ruiming Tang, Rui Zhang, Ruixuan Li

To this end, we formulate a novel quantization training paradigm to compress the embeddings from the training stage, termed low-precision training (LPT).

Click-Through Rate Prediction Quantization

Paper
Add Code

Low-rank Tensor Assisted K-space Generative Model for Parallel Imaging Reconstruction

no code implementations • 11 Dec 2022 • Wei zhang, Zengwei Xiao, Hui Tao, Minghui Zhang, Xiaoling Xu, Qiegen Liu

Although recent deep learning methods, especially generative models, have shown good performance in fast magnetic resonance imaging, there is still much room for improvement in high-dimensional generation.

Paper
Add Code

Matrix Profile XXVII: A Novel Distance Measure for Comparing Long Time Series

no code implementations • 9 Dec 2022 • Audrey Der, Chin-Chia Michael Yeh, Renjie Wu, Junpeng Wang, Yan Zheng, Zhongfang Zhuang, Liang Wang, Wei zhang, Eamonn Keogh

PRCIS is a distance measure for long time series, which exploits recent progress in our ability to summarize time series with dictionaries.

Anomaly Detection Dynamic Time Warping +2

Paper
Add Code

Dynamic Graph Node Classification via Time Augmentation

no code implementations • 7 Dec 2022 • Jiarui Sun, Mengting Gu, Chin-Chia Michael Yeh, Yujie Fan, Girish Chowdhary, Wei zhang

Node classification on dynamic graphs is challenging for two reasons.

Classification Node Classification

Paper
Add Code

G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks

1 code implementation • 7 Dec 2022 • Zhongwei Wan, Yichun Yin, Wei zhang, Jiaxin Shi, Lifeng Shang, Guangyong Chen, Xin Jiang, Qun Liu

Recently, domain-specific PLMs have been proposed to boost the task performance of specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs with domain-specific corpora.

General Knowledge Language Modelling +3

Paper
Code

FlowFace: Semantic Flow-guided Shape-aware Face Swapping

no code implementations • 6 Dec 2022 • Hao Zeng, Wei zhang, Changjie Fan, Tangjie Lv, Suzhen Wang, Zhimeng Zhang, Bowen Ma, Lincheng Li, Yu Ding, Xin Yu

Unlike most previous methods that focus on transferring the source inner facial features but neglect facial contours, our FlowFace can transfer both of them to a target face, thus leading to more realistic face swapping.

Face Swapping

Paper
Add Code

Quantized Wasserstein Procrustes Alignment of Word Embedding Spaces

no code implementations • AMTA 2022 • Prince O Aboagye, Yan Zheng, Michael Yeh, Junpeng Wang, Zhongfang Zhuang, Huiyuan Chen, Liang Wang, Wei zhang, Jeff Phillips

Optimal Transport (OT) provides a useful geometric framework to estimate the permutation matrix under unsupervised cross-lingual word embedding (CLWE) models that pose the alignment task as a Wasserstein-Procrustes problem.

Bilingual Lexicon Induction Quantization

Paper
Add Code

3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation

no code implementations • 2 Dec 2022 • Zutao Jiang, Guansong Lu, Xiaodan Liang, Jihua Zhu, Wei zhang, Xiaojun Chang, Hang Xu

Here, we make the first attempt to achieve generic text-guided cross-category 3D object generation via a new 3D-TOGO model, which integrates a text-to-views generation module and a views-to-3D generation module.

3D Generation Contrastive Learning +2

Paper
Add Code

Global Meets Local: Effective Multi-Label Image Classification via Category-Aware Weak Supervision

no code implementations • 23 Nov 2022 • Jiawei Zhan, Jun Liu, Wei Tang, Guannan Jiang, Xi Wang, Bin-Bin Gao, Tianliang Zhang, Wenlong Wu, Wei zhang, Chengjie Wang, Yuan Xie

This paper builds a unified framework to perform effective noisy-proposal suppression and to interact between global and local features for robust feature learning.

Feature Correlation Multi-Label Image Classification

Paper
Add Code

RIS-Assisted Self-Interference Mitigation for In-Band Full-Duplex Transceivers

no code implementations • 22 Nov 2022 • Wei zhang, Yi Jiang, Bin Zhou

The wireless in-band full-duplex (IBFD) technology can in theory double the system capacity over the conventional frequency division duplex (FDD) or time-division duplex (TDD) alternatives.

Quantization

Paper
Add Code

LVOS: A Benchmark for Long-term Video Object Segmentation

1 code implementation • ICCV 2023 • Lingyi Hong, Wenchao Chen, Zhongying Liu, Wei zhang, Pinxue Guo, Zhaoyu Chen, Wenqiang Zhang

The videos in our LVOS last 1.59 minutes on average, which is 20 times longer than videos in existing VOS datasets.

Object Semantic Segmentation +2

Paper
Code

Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions

no code implementations • 5 Nov 2022 • Wei zhang, Yanjun Han, Zhengyuan Zhou, Aaron Flores, Tsachy Weissman

In the past four years, a particularly important development in the digital advertising industry is the shift from second-price auctions to first-price auctions for online display ads.

Marketing

Paper
Add Code

Rethinking the Metric in Few-shot Learning: From an Adaptive Multi-Distance Perspective

no code implementations • 2 Nov 2022 • Jinxiang Lai, Siqian Yang, Guannan Jiang, Xi Wang, Yuxi Li, Zihui Jia, Xiaochen Chen, Jun Liu, Bin-Bin Gao, Wei zhang, Yuan Xie, Chengjie Wang

In this paper, for the first time, we investigate the contributions of different distance metrics, and propose an adaptive fusion scheme, bringing significant improvements in few-shot classification.

Few-Shot Learning

Paper
Add Code

Facial Action Unit Detection and Intensity Estimation from Self-supervised Representation

no code implementations • 28 Oct 2022 • Bowen Ma, Rudong An, Wei zhang, Yu Ding, Zeng Zhao, Rongsheng Zhang, Tangjie Lv, Changjie Fan, Zhipeng Hu

As a fine-grained and local expression behavior measurement, facial action unit (FAU) analysis (e.g., detection and intensity estimation) has been documented for its time-consuming, labor-intensive, and error-prone annotation.

Action Unit Detection Facial Action Unit Detection

Paper
Add Code

Global-to-local Expression-aware Embeddings for Facial Action Unit Detection

no code implementations • 27 Oct 2022 • Rudong An, Wei zhang, Hao Zeng, Wei Chen, Zhigang Deng, Yu Ding

Then, AU feature maps and their corresponding AU masks are multiplied to generate AU masked features focusing on local facial region.

Action Unit Detection Facial Action Unit Detection

Paper
Add Code
