Search Results for author: Wei Li

Found 343 papers, 120 papers with code

基于强化学习的古今汉语句子对齐研究(Research on Sentence Alignment of Ancient and Modern Chinese based on Reinforcement Learning)

no code implementations CCL 2022 Kuai Yu, Yanqiu Shao, Wei Li

“基于深度学习的有监督机器翻译取得了良好的效果, 但训练过程中需要大量质量较高的对齐语料。对于中文古今翻译场景, 高质量的平行语料并不多, 而粗对齐的篇章、段语料比较容易获得, 因此语料对齐很有研究价值和研究必要。在传统双语平行语料的句子对齐研究中, 传统方法根据双语文本中的长度、词汇、共现文字等语法信息, 建立一个综合评判标准来衡量两个句对之间相似度。此类方法虽然在单句对齐上取得了较好的效果, 但是对于句子语义匹配的能力有限, 并且在一些多对多的对齐模式上的性能表现不佳。在本文中我们提出尝试利用现在发展迅速且具有强大语义表示能力的预训练语言模型来考虑双语的语义信息, 但是单独使用预训练语言模型只能考虑相对局部的信息, 因此我们提出采用基于动态规划算法的强化学习训练目标来整合段落全局信息, 并且进行无监督训练。实验结果证明我们提出的方法训练得到的模型性能优于此前获得最好表现的基线模型, 尤其相较于传统模型难以处理的多对多对齐模式下, 性能提升较大。”

Sentence

Towards Efficient Coarse-to-Fine Networks for Action and Gesture Recognition

no code implementations ECCV 2020 Niamul Quader, Juwei Lu, Peng Dai, Wei Li

State-of-the-art approaches to video-based action and gesture recognition often employ two key concepts: First, they employ multistream processing; second, they use an ensemble of convolutional networks.

3D Action Recognition Action Classification +3

SgSum:Transforming Multi-document Summarization into Sub-graph Selection

1 code implementation EMNLP 2021 Moye Chen, Wei Li, Jiachen Liu, Xinyan Xiao, Hua Wu, Haifeng Wang

Comparing with traditional methods, our method has two main advantages: (1) the relations between sentences are captured by modeling both the graph structure of the whole document set and the candidate sub-graphs; (2) directly outputs an integrate summary in the form of sub-graph which is more informative and coherent.

Document Summarization Multi-Document Summarization +1

Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks

1 code implementation ECCV 2020 Niamul Quader, Md Mafijul Islam Bhuiyan, Juwei Lu, Peng Dai, Wei Li

We propose novel approaches for simultaneously identifying important weights of a convolutional neural network (ConvNet) and providing more attention to the important weights during training.

3D Action Recognition 3D Object Classification +7

《二十四史》古代汉语语义依存图库构建(Construction of Semantic Dependency Graph Bank of Ancient Chinese in twenty four histories)

no code implementations CCL 2022 Tian Huang, Yanqiu Shao, Wei Li

“语义依存图是NLP处理语义的深层分析方法, 能够对句子中词与词之间的语义进行分析。该文针对古代汉语特点, 在制定古代汉语语义依存图标注规范的基础上, 以《二十四史》为语料来源, 完成标注了规模为3000句的古代汉语语义依存图库, 标注一致性的kappa值为78. 83%。通过与现代汉语语义依存图库的对比, 对依存图库基本情况进行统计, 分析古代汉语的语义特色和规律。统计显示, 古代汉语语义分布宏观上符合齐普夫定律, 在语义事件描述上具有强烈的历史性叙事和正式文体特征, 如以人物纪传为中心, 时间、地点等周边角色描述细致, 叙事语言冷静客观, 缺少描述情态、语气、程度、时间状态等的修饰词语等。 "

基于统一模型的藏文新闻摘要(Abstractive Summarization of Tibetan News Based on Hybrid Model)

no code implementations CCL 2020 Xiaodong Yan, Xiaoqing Xie, Yu Zou, Wei Li

Seq2seq神经网络模型在中英文文本摘要的研究中取得了良好的效果, 但在低资源语言的文本摘要研究还处于探索阶段, 尤其是在藏语中。此外, 目前还没有大规模的标注语料库进行摘要提取。本文提出了一种生成藏文新闻摘要的统一模型。利用TextRank算法解决了藏语标注训练数据不足的问题。然后, 采用两层双GRU神经网络提取代表原始新闻的句子, 减少冗余信息。最后, 使用基于注意力机制的Seq2Seq来生成理解式摘要。同时, 我们加入了指针网络来处理未登录词的问题。实验结果表明, ROUGE-1评分比传统模型提高了2%。 关键词:文本摘要;藏文;TextRank; 指针网络;Bi-GRU

Abstractive Text Summarization

Meta-CQG: A Meta-Learning Framework for Complex Question Generation over Knowledge Bases

no code implementations COLING 2022 Kun Zhang, Yunqi Qiu, Yuanzhuo Wang, Long Bai, Wei Li, Xuhui Jiang, HuaWei Shen, Xueqi Cheng

Complex question generation over knowledge bases (KB) aims to generate natural language questions involving multiple KB relations or functional constraints.

Contrastive Learning Meta-Learning +2

An Image Enhancement Method for Improving Small Intestinal Villi Clarity

no code implementations25 Feb 2024 Shaojie Zhang, Yinghui Wang, Peixuan Liu, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

Subsequently, an adaptive light gain factor is generated based on the low-frequency component, and an adaptive gradient gain factor is derived from the convolution results of the Laplacian operator in different regions of small intestinal villi images.

Image Enhancement

Unlocking the Power of Large Language Models for Entity Alignment

no code implementations23 Feb 2024 Xuhui Jiang, Yinghan Shen, Zhichao Shi, Chengjin Xu, Wei Li, Zixuan Li, Jian Guo, HuaWei Shen, Yuanzhuo Wang

To address the constraints of limited input KG data, ChatEA introduces a KG-code translation module that translates KG structures into a format understandable by LLMs, thereby allowing LLMs to utilize their extensive background knowledge to improve EA accuracy.

Code Translation Entity Alignment +2

An Error-Matching Exclusion Method for Accelerating Visual SLAM

no code implementations22 Feb 2024 Shaojie Zhang, Yinghui Wang, Jiaxing Ma, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

In Visual SLAM, achieving accurate feature matching consumes a significant amount of time, severely impacting the real-time performance of the system.

A Feature Matching Method Based on Multi-Level Refinement Strategy

no code implementations21 Feb 2024 Shaojie Zhang, Yinghui Wang, Jiaxing Ma, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

Feature matching is a fundamental and crucial process in visual SLAM, and precision has always been a challenging issue in feature matching.

A Robust Error-Resistant View Selection Method for 3D Reconstruction

no code implementations18 Feb 2024 Shaojie Zhang, Yinghui Wang, Bin Nan, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

To address the issue of increased triangulation uncertainty caused by selecting views with small camera baselines in Structure from Motion (SFM) view selection, this paper proposes a robust error-resistant view selection method.

3D Reconstruction

Region Feature Descriptor Adapted to High Affine Transformations

no code implementations15 Feb 2024 Shaojie Zhang, Yinghui Wang, Bin Nan, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

To address the issue of feature descriptors being ineffective in representing grayscale feature information when images undergo high affine transformations, leading to a rapid decline in feature matching accuracy, this paper proposes a region feature descriptor based on simulating affine transformations using classification.

A Highlight Removal Method for Capsule Endoscopy Images

no code implementations11 Feb 2024 Shaojie Zhang, Yinghui Wang, Peixuan Liu, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

The images captured by Wireless Capsule Endoscopy (WCE) always exhibit specular reflections, and removing highlights while preserving the color and texture in the region remains a challenge.

highlight removal

UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion

no code implementations24 Jan 2024 Wei Li, Xue Xu, Jiachen Liu, Xinyan Xiao

This paper presents UNIMO-G, a simple multimodal conditional diffusion framework that operates on multimodal prompts with interleaved textual and visual inputs, which demonstrates a unified ability for both text-driven and subject-driven image generation.

Conditional Image Generation Denoising +5

OMG-Seg: Is One Model Good Enough For All Segmentation?

1 code implementation18 Jan 2024 Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, Wenwei Zhang, Yining Li, Kai Chen, Chen Change Loy

In this work, we address various segmentation tasks, each traditionally tackled by distinct or partially unified models.

Interactive Segmentation Panoptic Segmentation +3

Evolutionary Alternating Direction Method of Multipliers for Constrained Multi-Objective Optimization with Unknown Constraints

no code implementations2 Jan 2024 Shuang Li, Ke Li, Wei Li, Ming Yang

Constrained multi-objective optimization problems (CMOPs) pervade real-world applications in science, engineering, and design.

A Generalist FaceX via Learning Unified Facial Representation

1 code implementation31 Dec 2023 Yue Han, Jiangning Zhang, Junwei Zhu, Xiangtai Li, Yanhao Ge, Wei Li, Chengjie Wang, Yong liu, Xiaoming Liu, Ying Tai

This work presents FaceX framework, a novel facial generalist model capable of handling diverse facial tasks simultaneously.

Facial Editing

DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection

no code implementations25 Dec 2023 Li Xiang, Junbo Yin, Wei Li, Cheng-Zhong Xu, Ruigang Yang, Jianbing Shen

Specifically, DMA builds a domain-mixing 3D instance bank for the teacher and student models during training, resulting in aligned data representation.

3D Object Detection object-detection +1

Domain Similarity-Perceived Label Assignment for Domain Generalized Underwater Object Detection

no code implementations20 Dec 2023 Xisheng Li, Wei Li, Pinhao Song, Mingjun Zhang, Jie zhou

The inherent characteristics and light fluctuations of water bodies give rise to the huge difference between different layers and regions in underwater environments.

Data Augmentation object-detection +1

UINav: A maker of UI automation agents

no code implementations15 Dec 2023 Wei Li, Fu-Lin Hsu, Will Bishop, Folawiyo Campbell-Ajala, Oriana Riva, Max Lin

UINav agents are lightweight enough to run on mobile devices, yet they achieve high success rates with a modest number of task demonstrations.

CGS-Mask: Making Time Series Predictions Intuitive for All

no code implementations15 Dec 2023 Feng Lu, Wei Li, Yifei Sun, Cheng Song, Yufei Ren, Albert Y. Zomaya

Artificial intelligence (AI) has immense potential in time series prediction, but most explainable tools have limited capabilities in providing a systematic understanding of important features over time.

Decision Making Feature Importance +2

Brain Computer Interface Technology for Future Battlefield

no code implementations13 Dec 2023 Guodong Xiong, Xinyan Ma, Wei Li, Jiaqi Cao, Jian Zhong, Yicong Su

With the development of artificial intelligence and unmanned equipment, human-machine hybrid formations will be the main focus in future combat formations.

Brain Computer Interface Decision Making

CBQ: Cross-Block Quantization for Large Language Models

no code implementations13 Dec 2023 Xin Ding, Xiaoyu Liu, Zhijun Tu, Yun Zhang, Wei Li, Jie Hu, Hanting Chen, Yehui Tang, Zhiwei Xiong, Baoqun Yin, Yunhe Wang

Post-training quantization (PTQ) has played a key role in compressing large language models (LLMs) with ultra-low costs.

Quantization

GenDet: Towards Good Generalizations for AI-Generated Image Detection

1 code implementation12 Dec 2023 Mingjian Zhu, Hanting Chen, Mouxiao Huang, Wei Li, Hailin Hu, Jie Hu, Yunhe Wang

The misuse of AI imagery can have harmful societal effects, prompting the creation of detectors to combat issues like the spread of fake news.

Anomaly Detection

Knowledge Graph Driven Recommendation System Algorithm

no code implementations1 Dec 2023 Chaoyang Zhang, Yanan Li, Shen Chen, Siwei Fan, Wei Li

We first use a single-layer neural network to merge individual node features in the graph, and then adjust the aggregation weights of neighboring entities by incorporating influence factors.

Unsupervised learning of site percolation based on shuffled configurations

no code implementations20 Nov 2023 Dian Xu, Shanshan Wang, Feng Gao, Wei Li, Jianmin Shen

In the field of statistical physics, machine learning has gained significant popularity and has achieved remarkable results in recent studies on phase transitions. In this paper, we apply Principal Component Analysis (PCA) and Autoencoder(AE) based on Unsupervised learning to study the various configurations of the percolation model in equilibrium phase transition.

FireMatch: A Semi-Supervised Video Fire Detection Network Based on Consistency and Distribution Alignment

no code implementations9 Nov 2023 Qinghua Lin, Zuoyong Li, Kun Zeng, Haoyi Fan, Wei Li, Xiaoguang Zhou

Considering the limited quantity of labeled video data, we propose a semi-supervised fire detection model called FireMatch, which is based on consistency regularization and adversarial distribution alignment.

Data Augmentation Fairness +2

An invariant feature extraction for multi-modal images matching

no code implementations6 Nov 2023 Chenzhong Gao, Wei Li

This paper aims at providing an effective multi-modal images invariant feature extraction and matching algorithm for the application of multi-source data analysis.

Video-Helpful Multimodal Machine Translation

1 code implementation31 Oct 2023 Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi, Wei Li

In addition to the extensive training set, EVA contains a video-helpful evaluation set in which subtitles are ambiguous, and videos are guaranteed helpful for disambiguation.

Multimodal Machine Translation Translation

SALMONN: Towards Generic Hearing Abilities for Large Language Models

1 code implementation20 Oct 2023 Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Hearing is arguably an essential ability of artificial intelligence (AI) agents in the physical world, which refers to the perception and understanding of general auditory information consisting of at least three types of sounds: speech, audio events, and music.

Audio captioning Automatic Speech Recognition +10

MERTech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model With Multi-Task Finetuning

no code implementations15 Oct 2023 Dichucheng Li, Yinghao Ma, Weixing Wei, Qiuqiang Kong, Yulun Wu, Mingjin Che, Fan Xia, Emmanouil Benetos, Wei Li

Recognizing the significance of pitch in capturing the nuances of IPTs and the importance of onset in locating IPT events, we investigate multi-task finetuning with pitch and onset detection as auxiliary tasks.

Instrument Playing Technique Detection Self-Supervised Learning

On the Convergence of Federated Averaging under Partial Participation for Over-parameterized Neural Networks

no code implementations9 Oct 2023 Xin Liu, Wei Li, Dazhi Zhan, Yu Pan, Xin Ma, Yu Ding, Zhisong Pan

Federated learning (FL) is a widely employed distributed paradigm for collaboratively training machine learning models from multiple clients without sharing local data.

Federated Learning

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

2 code implementations9 Oct 2023 Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Audio-visual large language models (LLM) have drawn significant attention, yet the fine-grained combination of both input streams is rather under-explored, which is challenging but necessary for LLMs to understand general video inputs.

Question Answering Video Question Answering

Cross-head mutual Mean-Teaching for semi-supervised medical image segmentation

1 code implementation8 Oct 2023 Wei Li, Ruifeng Bian, Wenyi Zhao, Weijin Xu, Huihua Yang

To address these concerns, we propose a novel Cross-head mutual mean-teaching Network (CMMT-Net) incorporated strong-weak data augmentation, thereby benefitting both self-training and consistency learning.

Data Augmentation Image Segmentation +2

A Holistic Evaluation of Piano Sound Quality

no code implementations7 Oct 2023 Monan Zhou, Shangda Wu, Shaohua Ji, Zijin Li, Wei Li

Unlike previous studies that focused on the effect of piano performance techniques on sound quality, this study evaluates the inherent sound quality of different pianos.

Few-Shot Learning

Model2Scene: Learning 3D Scene Representation via Contrastive Language-CAD Models Pre-training

no code implementations29 Sep 2023 Runnan Chen, Xinge Zhu, Nenglun Chen, Dawei Wang, Wei Li, Yuexin Ma, Ruigang Yang, Tongliang Liu, Wenping Wang

In this paper, we propose Model2Scene, a novel paradigm that learns free 3D scene representation from Computer-Aided Design (CAD) models and languages.

3D Semantic Segmentation Object

IFT: Image Fusion Transformer for Ghost-free High Dynamic Range Imaging

no code implementations26 Sep 2023 Hailing Wang, Wei Li, Yuanyuan Xi, Jie Hu, Hanting Chen, Longyu Li, Yunhe Wang

By matching similar patches between frames, objects with large motion ranges in dynamic scenes can be aligned, which can effectively alleviate the generation of artifacts.

Connecting Speech Encoder and Large Language Model for ASR

no code implementations25 Sep 2023 Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Q-Former-based LLMs can generalise well to out-of-domain datasets, where 12% relative WER reductions over the Whisper baseline ASR model were achieved on the Eval2000 test set without using any in-domain training data from Switchboard.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Data Upcycling Knowledge Distillation for Image Super-Resolution

no code implementations25 Sep 2023 Yun Zhang, Wei Li, Simiao Li, Jie Hu, Hanting Chen, Hailing Wang, Zhijun Tu, Wenjia Wang, BingYi Jing, Yunhe Wang

In this paper, we put forth an approach from the perspective of effective data utilization, namely, the Data Upcycling Knowledge Distillation (DUKD), which facilitates the student model by the prior knowledge the teacher provided through the upcycled in-domain data derived from the input images.

Image Super-Resolution Knowledge Distillation +1

WikiMT++ Dataset Card

no code implementations23 Sep 2023 Monan Zhou, Shangda Wu, YuAn Wang, Wei Li

WikiMT++ is an expanded and refined version of WikiMusicText (WikiMT), featuring 1010 curated lead sheets in ABC notation.

Emotion Classification Information Retrieval +3

MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation

1 code implementation22 Sep 2023 Jiahao Xie, Wei Li, Xiangtai Li, Ziwei Liu, Yew Soon Ong, Chen Change Loy

We present MosaicFusion, a simple yet effective diffusion-based data augmentation approach for large vocabulary instance segmentation.

Data Augmentation Instance Segmentation +1

MiChao-HuaFen 1.0: A Specialized Pre-trained Corpus Dataset for Domain-specific Large Models

no code implementations21 Sep 2023 Yidong Liu, FuKai Shang, Fang Wang, Rui Xu, Jun Wang, Wei Li, Yao Li, Conghui He

With the advancement of deep learning technologies, general-purpose large models such as GPT-4 have demonstrated exceptional capabilities across various domains.

SoccerNet 2023 Challenges Results

2 code implementations12 Sep 2023 Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng

More information on the tasks, challenges, and leaderboards are available on https://www. soccer-net. org.

Action Spotting Camera Calibration +3

VIGC: Visual Instruction Generation and Correction

2 code implementations24 Aug 2023 Bin Wang, Fan Wu, Xiao Han, Jiahui Peng, Huaping Zhong, Pan Zhang, Xiaoyi Dong, Weijia Li, Wei Li, Jiaqi Wang, Conghui He

A practical solution to this problem would be to utilize the available multimodal large language models (MLLMs) to generate instruction data for vision-language tasks.

Hallucination Image Captioning +1

WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models

1 code implementation21 Aug 2023 Conghui He, Zhenjiang Jin, Chao Xu, Jiantao Qiu, Bin Wang, Wei Li, Hang Yan, Jiaqi Wang, Dahua Lin

The rise in popularity of ChatGPT and GPT-4 has significantly accelerated the development of large models, leading to the creation of numerous impressive large language models(LLMs) and multimodal large language models (MLLMs).

Improving Anomaly Segmentation with Multi-Granularity Cross-Domain Alignment

no code implementations16 Aug 2023 Ji Zhang, Xiao Wu, Zhi-Qi Cheng, Qi He, Wei Li

Anomaly segmentation plays a pivotal role in identifying atypical objects in images, crucial for hazard detection in autonomous driving systems.

Autonomous Driving Contrastive Learning

A Self-supervised SAR Image Despeckling Strategy Based on Parameter-sharing Convolutional Neural Networks

no code implementations11 Aug 2023 Liang Chen, Yifei Yin, Hao Shi, Qingqing Sheng, Wei Li

The training image pairs are generated by the sub-sampler from real-word SAR image to estimate the noise distribution.

Sar Image Despeckling

Hyper-pixel-wise Contrastive Learning Augmented Segmentation Network for Old Landslide Detection through Fusing High-Resolution Remote Sensing Images and Digital Elevation Model Data

no code implementations2 Aug 2023 Yiming Zhou, Yuexing Peng, Wei Li, Junchuan Yu, Daqing Ge, Wei Xiang

To extract accurate semantic features, a hyper-pixel-wise contrastive learning augmented segmentation network (HPCL-Net) is proposed, which augments the local salient feature extraction from boundaries of landslides through HPCL-Net and fuses heterogeneous infromation in the semantic space from high-resolution remote sensing images and digital elevation model data.

Contrastive Learning Landslide segmentation

Improving Generalization in Visual Reinforcement Learning via Conflict-aware Gradient Agreement Augmentation

no code implementations ICCV 2023 Siao Liu, Zhaoyu Chen, Yang Liu, Yuzheng Wang, Dingkang Yang, Zhile Zhao, Ziqing Zhou, Xie Yi, Wei Li, Wenqiang Zhang, Zhongxue Gan

In particular, CG2A develops a Gradient Agreement Solver to adaptively balance the varying gradient magnitudes, and introduces a Soft Gradient Surgery strategy to alleviate the gradient conflicts.

reinforcement-learning

Adaptive Graph Convolution Networks for Traffic Flow Forecasting

1 code implementation7 Jul 2023 Zhengdao Li, Wei Li, Kai Hwang

The AGC-net is constructed by the Adaptive Graph Convolution (AGC) based on a novel context attention mechanism, which consists of a set of graph wavelets with various learnable scales.

NeMO: Neural Map Growing System for Spatiotemporal Fusion in Bird's-Eye-View and BDD-Map Benchmark

no code implementations7 Jun 2023 Xi Zhu, Xiya Cao, Zhiwei Dong, Caifa Zhou, Qiangbo Liu, Wei Li, Yongliang Wang

We also provide a new scene-level BEV map evaluation setting along with the corresponding baseline for a more comprehensive comparison.

Autonomous Driving Time Series

Balancing Logit Variation for Long-tailed Semantic Segmentation

1 code implementation CVPR 2023 Yuchao Wang, Jingjing Fei, Haochen Wang, Wei Li, Tianpeng Bao, Liwei Wu, Rui Zhao, Yujun Shen

In this way, we manage to close the gap between the feature areas of different categories, resulting in a more balanced representation.

Semantic Segmentation

Contextual Object Detection with Multimodal Large Language Models

1 code implementation29 May 2023 Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy

Moreover, we present ContextDET, a unified multimodal model that is capable of end-to-end differentiable modeling of visual-language contexts, so as to locate, identify, and associate visual objects with language inputs for human-AI interaction.

Cloze Test Image Captioning +6

On the Value of Myopic Behavior in Policy Reuse

no code implementations28 May 2023 Kang Xu, Chenjia Bai, Shuang Qiu, Haoran He, Bin Zhao, Zhen Wang, Wei Li, Xuelong Li

Leveraging learned strategies in unfamiliar scenarios is fundamental to human intelligence.

Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation

no code implementations23 May 2023 Haochen Wang, Yujun Shen, Jingjing Fei, Wei Li, Liwei Wu, Yuxi Wang, Zhaoxiang Zhang

To this end, we propose T2S-DA, which we interpret as a form of pulling Target to Source for Domain Adaptation, encouraging the model in learning similar cross-domain features.

Domain Generalization Semantic Segmentation

Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring

no code implementations19 May 2023 Kaiqi Fu, Shaojun Gao, Shuju Shi, Xiaohai Tian, Wei Li, Zejun Ma

Specifically, we first pre-train the model using a reconstruction loss function, by masking phones and their durations jointly on a large amount of unlabeled speech and text prompts.

Self-Supervised Learning

PaLM 2 Technical Report

1 code implementation17 May 2023 Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu

Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.

 Ranked #1 on Question Answering on TriviaQA (using extra training data)

Language Modelling Multi-task Language Understanding +1

Mlinear: Rethink the Linear Model for Time-series Forecasting

no code implementations8 May 2023 Wei Li, Xiangxu Meng, Chuhao Chen, Jianing Chen

In this paper, we carefully examine the opposing properties of CI and CD, and raise a practical question that has not been effectively answered, e. g.,"How to effectively mix the CI and CD properties of time series to achieve better predictive performance?"

Philosophy Time Series +1

TransHP: Image Classification with Hierarchical Prompting

1 code implementation NeurIPS 2023 Wenhao Wang, Yifan Sun, Wei Li, Yi Yang

This paper explores a hierarchical prompting mechanism for the hierarchical image classification (HIC) task.

Classification Image Classification

Adaptive Mask Sampling and Manifold to Euclidean Subspace Learning with Distance Covariance Representation for Hyperspectral Image Classification

1 code implementation IEEE Transactions on Geoscience and Remote Sensing 2023 Mingsong Li, Wei Li, Yikun Liu, Yuwen Huang, and Gongping Yang.

Subsequently, based on distance covariance descriptor, a dual channel distance covariance representation (DC-DCR) module is proposed for modeling unified spectral-spatial feature representations and exploring spectral-spatial relationships, especially linear and nonlinear interdependence in spectral domain.

 Ranked #1 on Hyperspectral Image Classification on Indian Pines (OA@5%perclass metric)

Hyperspectral image analysis Hyperspectral Image Classification +1

Siamese DETR

1 code implementation CVPR 2023 Zeren Chen, Gengshi Huang, Wei Li, Jianing Teng, Kun Wang, Jing Shao, Chen Change Loy, Lu Sheng

In this work, we present Siamese DETR, a Siamese self-supervised pretraining approach for the Transformer architecture in DETR.

MULTI-VIEW LEARNING Representation Learning

Correlational Image Modeling for Self-Supervised Visual Pre-Training

1 code implementation CVPR 2023 Wei Li, Jiahao Xie, Chen Change Loy

We introduce Correlational Image Modeling (CIM), a novel and surprisingly effective approach to self-supervised visual pre-training.

DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training

1 code implementation6 Mar 2023 Wei Li, Linchao Zhu, Longyin Wen, Yi Yang

This decoder is both data-efficient and computation-efficient: 1) it only requires the text data for training, easing the burden on the collection of paired data.

Image Captioning Text Generation

Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation

no code implementations CVPR 2023 Rui Zhao, Wei Li, Zhipeng Hu, Lincheng Li, Zhengxia Zou, Zhenwei Shi, Changjie Fan

In our method, taking the power of large-scale pre-trained multi-modal CLIP and neural rendering, T2P searches both continuous facial parameters and discrete facial parameters in a unified framework.

Face Model Neural Rendering +2

Efficient Masked Autoencoders with Self-Consistency

no code implementations28 Feb 2023 Zhaowen Li, Yousong Zhu, Zhiyang Chen, Wei Li, Chaoyang Zhao, Liwei Wu, Rui Zhao, Ming Tang, Jinqiao Wang

However, its high random mask ratio would result in two serious problems: 1) the data are not efficiently exploited, which brings inefficient pre-training (\eg, 1600 epochs for MAE $vs.$ 300 epochs for the supervised), and 2) the high uncertainty and inconsistency of the pre-trained model, \ie, the prediction of the same patch may be inconsistent under different mask rounds.

Language Modelling Masked Language Modeling +3

An Iterative Classification and Semantic Segmentation Network for Old Landslide Detection Using High-Resolution Remote Sensing Images

no code implementations24 Feb 2023 Zili Lu, Yuexing Peng, Wei Li, Junchuan Yu, Daqing Ge, Wei Xiang

An object-level contrastive learning (OCL) strategy is employed in the object classification sub-network featuring a siamese network to realize the global features extraction, and a sub-object-level contrastive learning (SOCL) paradigm is designed in the semantic segmentation sub-network to efficiently extract salient features from boundaries of landslides.

Classification Contrastive Learning +3

Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

no code implementations21 Feb 2023 Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

Recent studies on pronunciation scoring have explored the effect of introducing phone embeddings as reference pronunciation, but mostly in an implicit manner, i. e., addition or concatenation of reference phone embedding and actual pronunciation of the target phone as the phone-level pronunciation quality representation.

An ASR-free Fluency Scoring Approach with Self-Supervised Learning

no code implementations20 Feb 2023 Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

A typical fluency scoring system generally relies on an automatic speech recognition (ASR) system to obtain time stamps in input speech for either the subsequent calculation of fluency-related features or directly modeling speech fluency with an end-to-end approach.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

AIIR-MIX: Multi-Agent Reinforcement Learning Meets Attention Individual Intrinsic Reward Mixing Network

no code implementations19 Feb 2023 Wei Li, Weiyan Liu, Shitong Shao, Shiyi Huang

The results show that AIIR-MIX can dynamically assign each agent a real-time intrinsic reward in accordance with their actual contribution.

Multi-agent Reinforcement Learning Starcraft +1

Objective Evaluation-based High-efficiency Learning Framework for Hyperspectral Image Classification

no code implementations10 Jan 2023 Xuming Zhang, Jian Yan, Jia Tian, Wei Li, Xingfa Gu, Qingjiu Tian

This framework comprises two main parts: (i) a leakage-free balanced sampling strategy, and (ii) a modified end-to-end fully convolutional network (FCN) architecture that optimizes the trade-off between accuracy and efficiency.

Hyperspectral Image Classification Vocal Bursts Intensity Prediction

RefSR-NeRF: Towards High Fidelity and Super Resolution View Synthesis

no code implementations CVPR 2023 Xudong Huang, Wei Li, Jie Hu, Hanting Chen, Yunhe Wang

We present Reference-guided Super-Resolution Neural Radiance Field (RefSR-NeRF) that extends NeRF to super resolution and photorealistic novel view synthesis.

Neural Rendering Novel View Synthesis +1

WeCheck: Strong Factual Consistency Checker via Weakly Supervised Learning

1 code implementation20 Dec 2022 Wenhao Wu, Wei Li, Xinyan Xiao, Jiachen Liu, Sujian Li, Yajuan Lv

As a result, they perform poorly on the real generated text and are biased heavily by their single-source upstream tasks.

Natural Language Inference Question Answering +2

DCS-RISR: Dynamic Channel Splitting for Efficient Real-world Image Super-Resolution

no code implementations15 Dec 2022 Junbo Qiao, Shaohui Lin, Yunlun Zhang, Wei Li, Jie Hu, Gaoqi He, Changbo Wang, Lizhuang Ma

Real-world image super-resolution (RISR) has received increased focus for improving the quality of SR images under unknown complex degradation.

Image Super-Resolution SSIM

SSDA3D: Semi-supervised Domain Adaptation for 3D Object Detection from Point Cloud

1 code implementation6 Dec 2022 Yan Wang, Junbo Yin, Wei Li, Pascal Frossard, Ruigang Yang, Jianbing Shen

However, these UDA solutions just yield unsatisfactory 3D detection results when there is a severe domain shift, e. g., from Waymo (64-beam) to nuScenes (32-beam).

3D Object Detection Autonomous Driving +5

Exploring Stochastic Autoregressive Image Modeling for Visual Representation

1 code implementation3 Dec 2022 Yu Qi, Fan Yang, Yousong Zhu, Yufei Liu, Liwei Wu, Rui Zhao, Wei Li

By introducing stochastic prediction and the parallel encoder-decoder, SAIM significantly improve the performance of autoregressive image modeling.

Self-Supervised Learning

MIAD: A Maintenance Inspection Dataset for Unsupervised Anomaly Detection

no code implementations25 Nov 2022 Tianpeng Bao, Jiadong Chen, Wei Li, Xiang Wang, Jingjing Fei, Liwei Wu, Rui Zhao, Ye Zheng

However, existing datasets for unsupervised anomaly detection are biased towards manufacturing inspection, not considering maintenance inspection which is usually conducted under outdoor uncontrolled environment such as varying camera viewpoints, messy background and degradation of object surface after long-term working.

Unsupervised Anomaly Detection

Transformation-Equivariant 3D Object Detection for Autonomous Driving

no code implementations22 Nov 2022 Hai Wu, Chenglu Wen, Wei Li, Xin Li, Ruigang Yang, Cheng Wang

However, it is difficult to apply such networks to 3D object detection in autonomous driving due to its large computation cost and slow reasoning speed.

3D Object Detection Autonomous Driving +3

A Data-driven Case-based Reasoning in Bankruptcy Prediction

no code implementations2 Nov 2022 Wei Li, Wolfgang Karl Härdle, Stefan Lessmann

In addition, we delicately examine the explainability of the CBR system in the decision-making process of bankruptcy prediction.

Decision Making

FRSUM: Towards Faithful Abstractive Summarization via Enhancing Factual Robustness

no code implementations1 Nov 2022 Wenhao Wu, Wei Li, Jiachen Liu, Xinyan Xiao, Ziqiang Cao, Sujian Li, Hua Wu

We first measure a model's factual robustness by its success rate to defend against adversarial attacks when generating factual information.

Abstractive Text Summarization

Jet tagging algorithm of graph network with HaarPooling message passing

no code implementations25 Oct 2022 Fei Ma, Feiyi Liu, Wei Li

In this paper, we introduce an approach of GNNs combined with a HaarPooling operation to analyze the events, called HaarPooling Message Passing neural network (HMPNet).

Jet Tagging

Precisely the Point: Adversarial Augmentations for Faithful and Informative Text Generation

no code implementations22 Oct 2022 Wenhao Wu, Wei Li, Jiachen Liu, Xinyan Xiao, Sujian Li, Yajuan Lyu

Though model robustness has been extensively studied in language understanding, the robustness of Seq2Seq generation remains understudied.

Informativeness Text Generation

Robot Navigation with Reinforcement Learned Path Generation and Fine-Tuned Motion Control

no code implementations19 Oct 2022 Longyuan Zhang, Ziyue Hou, Ji Wang, Ziang Liu, Wei Li

Multiple predictive path points are dynamically generated by a deep Markov model optimized using RL approach for robot to track.

Reinforcement Learning (RL) Robot Navigation

Zero-shot point cloud segmentation by transferring geometric primitives

no code implementations18 Oct 2022 Runnan Chen, Xinge Zhu, Nenglun Chen, Wei Li, Yuexin Ma, Ruigang Yang, Wenping Wang

To this end, we propose a novel framework to learn the geometric primitives shared in seen and unseen categories' objects and employ a fine-grained alignment between language and the learned geometric primitives.

Point Cloud Segmentation Semantic Segmentation

HiSMatch: Historical Structure Matching based Temporal Knowledge Graph Reasoning

no code implementations18 Oct 2022 Zixuan Li, Zhongni Hou, Saiping Guan, Xiaolong Jin, Weihua Peng, Long Bai, Yajuan Lyu, Wei Li, Jiafeng Guo, Xueqi Cheng

This is actually a matching task between a query and candidate entities based on their historical structures, which reflect behavioral trends of the entities at different timestamps.

Relation

Unified Vision and Language Prompt Learning

1 code implementation13 Oct 2022 Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy

Prompt tuning, a parameter- and data-efficient transfer learning paradigm that tunes only a small number of parameters in a model's input space, has become a trend in the vision community since the emergence of large vision-language models like CLIP.

Domain Generalization Few-Shot Learning +1

Stock Trading Volume Prediction with Dual-Process Meta-Learning

1 code implementation11 Oct 2022 Ruibo Chen, Wei Li, Zhiyuan Zhang, Ruihan Bao, Keiko Harimoto, Xu sun

Our method can model the common pattern behind different stocks with a meta-learner, while modeling the specific pattern for each stock across time spans with stock-dependent parameters.

Algorithmic Trading Meta-Learning

Misaligned orientations of 4f optical neural network for image classification accuracy on various datasets

no code implementations5 Oct 2022 Yanbing Liu, Wei Li, Kun Cheng, Xun Liu, Wei Yang

In order to comprehensively investigate the influence caused by the misalignment, we proposed a method for estimating the performance of a 4f-ONN in response to various misalignment in the context of the image classification task. The misalignment in numerical simulation is estimated by manipulating the optical intensity distributions in the fourth focus plane in the 4f system.

Classification Image Classification

SoccerNet 2022 Challenges Results

7 code implementations5 Oct 2022 Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li

The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.

Action Spotting Camera Calibration +3

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

2 code implementations28 Sep 2022 Zhiyang Chen, Yousong Zhu, Zhaowen Li, Fan Yang, Wei Li, Haixin Wang, Chaoyang Zhao, Liwei Wu, Rui Zhao, Jinqiao Wang, Ming Tang

Obj2Seq is able to flexibly determine input categories to satisfy customized requirements, and be easily extended to different visual tasks.

Multi-Label Classification Object +2

Open-Ended Diverse Solution Discovery with Regulated Behavior Patterns for Cross-Domain Adaptation

no code implementations24 Sep 2022 Kang Xu, Yan Ma, Bingsheng Wei, Wei Li

While Reinforcement Learning can achieve impressive results for complex tasks, the learned policies are generally prone to fail in downstream tasks with even minor model mismatch or unexpected perturbations.

Domain Adaptation

Quantification before Selection: Active Dynamics Preference for Robust Reinforcement Learning

no code implementations23 Sep 2022 Kang Xu, Yan Ma, Wei Li

Our key insight is that dynamic systems with different parameters provide different levels of difficulty for the policy, and the difficulty of behaving well in a system is constantly changing due to the evolution of the policy.

Informativeness reinforcement-learning +1

GANet: Goal Area Network for Motion Forecasting

1 code implementation20 Sep 2022 Mingkun Wang, Xinge Zhu, Changqian Yu, Wei Li, Yuexin Ma, Ruochun Jin, Xiaoguang Ren, Dongchun Ren, Mingxu Wang, Wenjing Yang

In view of this, we propose a new goal area-based framework, named Goal Area Network (GANet), for motion forecasting, which models goal areas rather than exact goal coordinates as preconditions for trajectory prediction, performing more robustly and accurately.

Motion Forecasting Trajectory Prediction

Playing Technique Detection by Fusing Note Onset Information in Guzheng Performance

no code implementations19 Sep 2022 Dichucheng Li, Yulun Wu, Qinyu Li, Jiahao Zhao, Yi Yu, Fan Xia, Wei Li

Because each Guzheng playing technique is applied to a note, a dedicated onset detector is trained to divide an audio into several notes and its predictions are fused with frame-wise IPT predictions.

LO-Det: Lightweight Oriented Object Detection in Remote Sensing Images

no code implementations16 Sep 2022 Zhanchao Huang, Wei Li, Xiang-Gen Xia, Hao Wang, Feiran Jie, Ran Tao

Specifically, a channel separation-aggregation (CSA) structure is designed to simplify the complexity of stacked separable convolutions, and a dynamic receptive field (DRF) mechanism is developed to maintain high accuracy by customizing the convolution kernel and its perception range dynamically when reducing the network complexity.

object-detection Object Detection +1

Multi-Grained Angle Representation for Remote Sensing Object Detection

1 code implementation7 Sep 2022 Hao Wang, Zhanchao Huang, Zhengchao Chen, Ying Song, Wei Li

The existing AOOD methods face the challenges of ambiguity and high costs in angle representation.

Object object-detection +2

Language-aware Domain Generalization Network for Cross-Scene Hyperspectral Image Classification

no code implementations6 Sep 2022 Yuxiang Zhang, Mengmeng Zhang, Wei Li, Shuai Wang, Ran Tao

Text information including extensive prior knowledge about land cover classes has been ignored in hyperspectral image classification (HSI) tasks.

Contrastive Learning Domain Generalization +1

Task-wise Sampling Convolutions for Arbitrary-Oriented Object Detection in Aerial Images

1 code implementation6 Sep 2022 Zhanchao Huang, Wei Li, Xiang-Gen Xia, Hao Wang, Ran Tao

However, the inconsistent features for the localization and classification tasks in AOOD models may lead to ambiguity and low-quality object predictions, which constrains the detection performance.

object-detection Object Detection In Aerial Images +1

Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies

1 code implementation3 Sep 2022 Xingrun Xing, Yangguang Li, Wei Li, Wenrui Ding, Yalong Jiang, Yufeng Wang, Jing Shao, Chunlei Liu, Xianglong Liu

Second, to improve the robustness of binary models with contextual dependencies, we compute the contextual dynamic embeddings to determine the binarization thresholds in general binary convolutional blocks.

Binarization Inductive Bias

DeepInteraction: 3D Object Detection via Modality Interaction

2 code implementations23 Aug 2022 Zeyu Yang, Jiaqi Chen, Zhenwei Miao, Wei Li, Xiatian Zhu, Li Zhang

Existing top-performance 3D object detectors typically rely on the multi-modal fusion strategy.

3D Object Detection Object +2

Rethinking Graph Neural Networks for the Graph Coloring Problem

no code implementations15 Aug 2022 Wei Li, Ruxuan Li, Yuzhe ma, Siu On Chan, David Pan, Bei Yu

Graph coloring, a classical and critical NP-hard problem, is the problem of assigning connected nodes as different colors as possible.

Making the Best of Both Worlds: A Domain-Oriented Transformer for Unsupervised Domain Adaptation

1 code implementation2 Aug 2022 Wenxuan Ma, Jinming Zhang, Shuang Li, Chi Harold Liu, Yulin Wang, Wei Li

To alleviate these issues, we propose to simultaneously conduct feature alignment in two individual spaces focusing on different domains, and create for each space a domain-oriented classifier tailored specifically for that domain.

Pseudo Label Unsupervised Domain Adaptation

Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios

4 code implementations12 Jul 2022 Jiashi Li, Xin Xia, Wei Li, Huixia Li, Xing Wang, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan

Then, Next Hybrid Strategy (NHS) is designed to stack NCB and NTB in an efficient hybrid paradigm, which boosts performance in various downstream tasks.

Image Classification

Positive-Negative Equal Contrastive Loss for Semantic Segmentation

no code implementations4 Jul 2022 Jing Wang, Jiangyun Li, Wei Li, Lingfei Xuan, Tianxiang Zhang, Wenxuan Wang

The contextual information is critical for various computer vision tasks, previous works commonly design plug-and-play modules and structural losses to effectively extract and aggregate the global context.

Contrastive Learning Semantic Segmentation

SJ-HD^2R: Selective Joint High Dynamic Range and Denoising Imaging for Dynamic Scenes

no code implementations20 Jun 2022 Wei Li, Shuai Xiao, Tianhong Dai, Shanxin Yuan, Tao Wang, Cheng Li, Fenglong Song

To further leverage these two paradigms, we propose a selective and joint HDR and denoising (SJ-HD$^2$R) imaging framework, utilizing scenario-specific priors to conduct the path selection with an accuracy of more than 93. 3$\%$.

Denoising

Masked Frequency Modeling for Self-Supervised Visual Pre-Training

3 code implementations15 Jun 2022 Jiahao Xie, Wei Li, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy

We present Masked Frequency Modeling (MFM), a unified frequency-domain-based approach for self-supervised pre-training of visual models.

Image Classification Image Restoration +2

Toward Real-world Single Image Deraining: A New Benchmark and Beyond

1 code implementation11 Jun 2022 Wei Li, Qiming Zhang, Jing Zhang, Zhen Huang, Xinmei Tian, DaCheng Tao

To address these issues, we establish a new high-quality dataset named RealRain-1k, consisting of $1, 120$ high-resolution paired clean and rainy images with low- and high-density rain streaks, respectively.

Domain Generalization Image Restoration +2

MPANet: Multi-Patch Attention For Infrared Small Target object Detection

no code implementations5 Jun 2022 Ao Wang, Wei Li, Xin Wu, Zhanchao Huang, Ran Tao

To this end, a multi-patch attention network (MPANet) based on the axial-attention encoder and the multi-scale patch branch (MSPB) structure is proposed.

object-detection Object Detection

Benchmarking Unsupervised Anomaly Detection and Localization

no code implementations30 May 2022 Ye Zheng, Xiang Wang, Yu Qi, Wei Li, Liwei Wu

From the time the MVTec AD dataset was proposed to the present, new research methods that are constantly being proposed push its precision to saturation.

Benchmarking Unsupervised Anomaly Detection

Do We Really Need to Use Constraint Violation in Constrained Evolutionary Multi-Objective Optimization?

no code implementations28 May 2022 Shuang Li, Ke Li, Wei Li

Constraint violation has been a building block to design evolutionary multi-objective optimization algorithms for solving constrained multi-objective optimization problems.

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

no code implementations25 May 2022 Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, Javen Qinfeng Shi, Dong Gong, Dan Zhu, Mengdi Sun, Guannan Chen, Yang Hu, Haowei Li, Baozhu Zou, Zhen Liu, Wenjie Lin, Ting Jiang, Chengzhi Jiang, Xinpeng Li, Mingyan Han, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Juan Marín-Vega, Michael Sloth, Peter Schneider-Kamp, Richard Röttger, Chunyang Li, Long Bao, Gang He, Ziyao Xu, Li Xu, Gen Zhan, Ming Sun, Xing Wen, Junlin Li, Shuang Feng, Fei Lei, Rui Liu, Junxiang Ruan, Tianhong Dai, Wei Li, Zhan Lu, Hengyan Liu, Peian Huang, Guangyu Ren, Yonglin Luo, Chang Liu, Qiang Tu, Fangya Li, Ruipeng Gang, Chenghua Li, Jinjing Li, Sai Ma, Chenming Liu, Yizhen Cao, Steven Tel, Barthelemy Heyrman, Dominique Ginhac, Chul Lee, Gahyeon Kim, Seonghyun Park, An Gia Vien, Truong Thanh Nhat Mai, Howoon Yoon, Tu Vo, Alexander Holston, Sheir Zaheer, Chan Y. Park

The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i. e. solutions can not exceed a given number of operations).

Image Restoration Vocal Bursts Intensity Prediction

A Survey on Hyperspectral Image Restoration: From the View of Low-Rank Tensor Approximation

no code implementations18 May 2022 Na Liu, Wei Li, Yinjian Wang, Rao Tao, Qian Du, Jocelyn Chanussot

The ability of capturing fine spectral discriminative information enables hyperspectral images (HSIs) to observe, detect and identify objects with subtle spectral discrepancy.

Deblurring Denoising +2

$(O,G)$-granular variable precision fuzzy rough sets based on overlap and grouping functions

no code implementations18 May 2022 Wei Li, Bin Yang, Junsheng Qiao

In this paper, the depiction of $(O, G)$-granular variable precision fuzzy rough sets ($(O, G)$-GVPFRSs for short) is first given based on overlap and grouping functions.

Some neighborhood-related fuzzy covering-based rough set models and their applications for decision making

no code implementations13 May 2022 Gongao Qi, Bin Yang, Wei Li

In order to further generalize the FRS theory to more complicated data environments, we firstly propose four types of fuzzy neighborhood operators based on fuzzy covering by overlap functions and their implicators in this paper.

Decision Making

On three types of $L$-fuzzy $β$-covering-based rough sets

no code implementations13 May 2022 Wei Li, Bin Yang, Junsheng Qiao

In this paper, we mainly construct three types of $L$-fuzzy $\beta$-covering-based rough set models and study the axiom sets, matrix representations and interdependency of these three pairs of $L$-fuzzy $\beta$-covering-based rough approximation operators.

valid

Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering

no code implementations2 May 2022 AJ Piergiovanni, Wei Li, Weicheng Kuo, Mohammad Saffar, Fred Bertsch, Anelia Angelova

We present Answer-Me, a task-aware multi-task framework which unifies a variety of question answering tasks, such as, visual question answering, visual entailment, visual reasoning.

Image Captioning Question Answering +4

HarmoF0: Logarithmic Scale Dilated Convolution For Pitch Estimation

1 code implementation2 May 2022 Weixing Wei, Peilin Li, Yi Yu, Wei Li

Sounds, especially music, contain various harmonic components scattered in the frequency dimension.

Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention

no code implementations24 Apr 2022 Yanxiong Li, Wucheng Wang, Hao Chen, Wenchang Cao, Wei Li, Qianhua He

Although few-shot learning has attracted much attention from the fields of image and audio classification, few efforts have been made on few-shot speaker identification.

Audio Classification Few-Shot Learning +1

Explore More Guidance: A Task-aware Instruction Network for Sign Language Translation Enhanced with Data Augmentation

1 code implementation Findings (NAACL) 2022 Yong Cao, Wei Li, Xianzhi Li, Min Chen, Guangyong Chen, Long Hu, Zhengdao Li, Hwang Kai

Sign language recognition and translation first uses a recognition module to generate glosses from sign language videos and then employs a translation module to translate glosses into spoken sentences.

Data Augmentation Sign Language Recognition +2

A3CLNN: Spatial, Spectral and Multiscale Attention ConvLSTM Neural Network for Multisource Remote Sensing Data Classification

no code implementations9 Apr 2022 Heng-Chao Li, Wen-Shuai Hu, Wei Li, Jun Li, Qian Du, Antonio Plaza

The problem of effectively exploiting the information multiple data sources has become a relevant but challenging research topic in remote sensing.

Transfer Learning

MS-HLMO: Multi-scale Histogram of Local Main Orientation for Remote Sensing Image Registration

no code implementations1 Apr 2022 Chenzhong Gao, Wei Li, Ran Tao, Qian Du

Considering the characteristics and differences of multi-source remote sensing images, a feature-based registration algorithm named Multi-scale Histogram of Local Main Orientation (MS-HLMO) is proposed.

Image Registration

MeMOT: Multi-Object Tracking with Memory

no code implementations CVPR 2022 Jiarui Cai, Mingze Xu, Wei Li, Yuanjun Xiong, Wei Xia, Zhuowen Tu, Stefano Soatto

We propose an online tracking algorithm that performs the object detection and data association under a common framework, capable of linking objects after a long time span.

Multi-Object Tracking Object +2

FindIt: Generalized Localization with Natural Language Queries

no code implementations31 Mar 2022 Weicheng Kuo, Fred Bertsch, Wei Li, AJ Piergiovanni, Mohammad Saffar, Anelia Angelova

We propose FindIt, a simple and versatile framework that unifies a variety of visual grounding and localization tasks including referring expression comprehension, text-based localization, and object detection.

Natural Language Queries Object +5

Open-Vocabulary DETR with Conditional Matching

1 code implementation22 Mar 2022 Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy

To this end, we propose a novel open-vocabulary detector based on DETR -- hence the name OV-DETR -- which, once trained, can detect any object given its class name or an exemplar image.

Language Modelling object-detection +1

UNIMO-2: End-to-End Unified Vision-Language Grounded Learning

1 code implementation Findings (ACL) 2022 Wei Li, Can Gao, guocheng niu, Xinyan Xiao, Hao liu, Jiachen Liu, Hua Wu, Haifeng Wang

In particular, we propose to conduct grounded learning on both images and texts via a sharing grounded space, which helps bridge unaligned images and texts, and align the visual and textual semantic spaces on different types of corpora.

1,683