no code implementations • AACL (NLP-TEA) 2020 • Yi Wang, Ruibin Yuan, Yan‘gen Luo, Yufang Qin, NianYong Zhu, Peng Cheng, Lihuan Wang
A better Chinese Grammatical Error Diagnosis (CGED) system for automatic Grammatical Error Correction (GEC) can benefit foreign Chinese learners and lower Chinese learning barriers.
1 code implementation • ACL 2022 • Yupian Lin, Tong Ruan, Ming Liang, Tingting Cai, Wen Du, Yi Wang
Secondly, the tool provides annotation of events, nested event and nested entity, which are frequently required in domain-related text structuring tasks.
1 code implementation • 18 Mar 2025 • Chenting Wang, Kunchang Li, Tianxiang Jiang, Xiangyu Zeng, Yi Wang, LiMin Wang
By making the sampling grid flexible and leveraging token selection, it is easily adopted in most popular video training frameworks, boosting model robustness with nearly no additional cost.
1 code implementation • 13 Mar 2025 • Leonard Waldmann, Ando Shah, Yi Wang, Nils Lehmann, Adam J. Stewart, Zhitong Xiong, Xiao Xiang Zhu, Stefan Bauer, John Chuang
Earth observation (EO) data features diverse sensing platforms with varying spectral bands, spatial resolutions, and sensing modalities.
1 code implementation • 8 Mar 2025 • Zhitong Xiong, Yi Wang, Weikang Yu, Adam J Stewart, Jie Zhao, Nils Lehmann, Thomas Dujardin, Zhenghang Yuan, Pedram Ghamisi, Xiao Xiang Zhu
Earth observation (EO) data, collected from diverse sensors with varying imaging principles, present significant challenges in creating unified analytical frameworks.
no code implementations • 7 Mar 2025 • Guanghao Zhang, Tao Zhong, Yan Xia, Zhelun Yu, Haoyuan Li, Wanggui He, Fangxun Shu, Mushui Liu, Dong She, Yi Wang, Hao Jiang
The construction of interleaved multimodal multi-step reasoning chains, which utilize critical visual region tokens, extracted from intermediate reasoning steps, as supervisory signals.
no code implementations • 27 Feb 2025 • Guanting Liu, Yi Wang, Xi Chen, Baoyu Du, Penglei Li, Yuan Wu, Zhice Fang
However, DL models rely heavily on high-quality labeled landslide data for strong feature extraction capabilities.
no code implementations • 25 Feb 2025 • Zhuoran Lu, Qian Zhou, Yi Wang
Generative AI significantly enhances player agency in interactive narratives (IN) by enabling just-in-time content generation that adapts to player actions.
1 code implementation • 21 Feb 2025 • Junliang Chen, Huaiyuan Xu, Yi Wang, Lap-Pui Chau
OccProphet reduces 58\%$\sim$78\% of the computational cost with a 2. 6$\times$ speedup compared with the state-of-the-art Cam4DOcc.
no code implementations • 17 Feb 2025 • Jiecheng Zhou, Ding Tang, Rong Fu, Boni Hu, Haoran Xu, Yi Wang, Zhilin Pei, Zhongling Su, Liang Liu, Xingcheng Zhang, Weiming Zhang
The burgeoning computational demands for training large language models (LLMs) necessitate efficient methods, including quantized training, which leverages low-bit arithmetic operations to reduce costs.
no code implementations • 17 Feb 2025 • Yi Wang, Fenghua Weng, Sibei Yang, Zhan Qin, Minlie Huang, Wenjie Wang
Large Language Models (LLMs) are widely applied in decision making, but their deployment is threatened by jailbreak attacks, where adversarial users manipulate model behavior to bypass safety measures.
no code implementations • 10 Feb 2025 • D. She, Mushui Liu, Jingxuan Pang, Jin Wang, Zhen Yang, Wanggui He, Guanghao Zhang, Yi Wang, Qihan Huang, Haobin Tang, Yunlong Yu, Siming Fu
Customized generation has achieved significant progress in image synthesis, yet personalized video generation remains challenging due to temporal inconsistencies and quality degradation.
no code implementations • 5 Feb 2025 • Peiyan Yue, Die Cai, Chu Guo, Mengxing Liu, Jun Xia, Yi Wang
Accurate automated segmentation of tibial plateau fractures (TPF) from computed tomography (CT) requires large amounts of annotated data to train deep learning models, but obtaining such annotations presents unique challenges.
no code implementations • 26 Jan 2025 • Ying Zheng, Yiyi Zhang, Yi Wang, Lap-Pui Chau
Source-free domain adaptation in visual emotion recognition (SFDA-VER) is a highly challenging task that requires adapting VER models to the target domain without relying on source data, which is of great significance for data privacy protection.
1 code implementation • 21 Jan 2025 • Yi Wang, Xinhao Li, Ziang Yan, Yinan He, Jiashuo Yu, Xiangyu Zeng, Chenting Wang, Changlian Ma, Haian Huang, Jianfei Gao, Min Dou, Kai Chen, Wenhai Wang, Yu Qiao, Yali Wang, LiMin Wang
This paper aims to improve the performance of video multimodal large language models (MLLM) via long and rich context (LRC) modeling.
Ranked #11 on
Referring Video Object Segmentation
on MeViS
no code implementations • 17 Jan 2025 • Xiaohui Li, Yihao Liu, Shuo Cao, Ziyan Chen, Shaobin Zhuang, Xiangyu Chen, Yinan He, Yi Wang, Yu Qiao
Diffusion models have demonstrated exceptional capabilities in image generation and restoration, yet their application to video super-resolution faces significant challenges in maintaining both high fidelity and temporal consistency.
1 code implementation • 14 Jan 2025 • Weichen Fan, Chenyang Si, Junhao Song, Zhenyu Yang, Yinan He, Long Zhuo, Ziqi Huang, Ziyue Dong, Jingwen He, Dongwei Pan, Yi Wang, Yuming Jiang, Yaohui Wang, Peng Gao, Xinyuan Chen, Hengjie Li, Dahua Lin, Yu Qiao, Ziwei Liu
We present Vchitect-2. 0, a parallel transformer architecture designed to scale up video diffusion models for large-scale text-to-video generation.
no code implementations • 13 Jan 2025 • Tieyuan Chen, Huabin Liu, Yi Wang, Yihang Chen, Tianyao He, Chaofan Gan, Huanyu He, Weiyao Lin
Given visual segments and textual descriptions of events, MECD identifies the causal associations between these events to derive a comprehensive and structured event-level video causal graph explaining why and how the result event occurred.
1 code implementation • 7 Jan 2025 • Zetian Feng, Dong Ni, Yi Wang
The registration of magnetic resonance (MR) and transrectal ultrasound (TRUS) can provide guidance for the targeted biopsy of prostate cancer.
no code implementations • 7 Jan 2025 • Ying Chen, Rami Al-Maskari, Izabela Horvath, Mayar Ali, Luciano Hoher, Kaiyuan Yang, Zengming Lin, Zhiwei Zhai, Mengzhe Shen, Dejin Xun, Yi Wang, Tony Xu, Maged Goubran, Yunheng Wu, Kensaku MORI, Johannes C. Paetzold, Ali Erturk
Combined with the progress in large-scale data analysis, driven by deep learning, these innovations empower researchers to rapidly investigate the morphological and functional properties of diverse biological samples.
no code implementations • 7 Jan 2025 • Le Yang, Siyang Gao, Cheng Li, Yi Wang
We consider the problem of the best arm identification in the presence of stochastic constraints, where there is a finite number of arms associated with multiple performance measures.
no code implementations • 4 Jan 2025 • Yangze Zhou, Guoxin Lin, Gonghao Zhang, Yi Wang
However, the difference in MF collected in various locations within a region may be significant, which poses a challenge in selecting the appropriate MF from numerous locations.
2 code implementations • 31 Dec 2024 • Xinhao Li, Yi Wang, Jiashuo Yu, Xiangyu Zeng, Yuhan Zhu, Haian Huang, Jianfei Gao, Kunchang Li, Yinan He, Chenting Wang, Yu Qiao, Yali Wang, LiMin Wang
This paper introduces a Hierarchical visual token Compression (HiCo) method designed for high-fidelity representation and a practical context modeling system VideoChat-Flash tailored for multimodal long-sequence processing.
1 code implementation • 26 Dec 2024 • Ziang Yan, Zhilin Li, Yinan He, Chenting Wang, Kunchang Li, Xinhao Li, Xiangyu Zeng, Zilei Wang, Yali Wang, Yu Qiao, LiMin Wang, Yi Wang
Current multimodal large language models (MLLMs) struggle with fine-grained or precise understanding of visuals though they give comprehensive perception and reasoning in a spectrum of vision applications.
no code implementations • 23 Dec 2024 • Yang Xu, Yi Wang, Hao Wang
Understanding training dynamics and feature evolution is crucial for the mechanistic interpretability of large language models (LLMs).
no code implementations • 22 Dec 2024 • Leidong Xu, Danyal Mohaddes, Yi Wang
FoamPilot provides three core functionalities: code insight, case configuration and simulation evaluation.
no code implementations • 13 Dec 2024 • Ha Luu, Mert Sisman, Ilhami Kovanlikaya, Tam Vu, Pascal Spincemaille, Yi Wang, Francesca Bagnato, Susan Gauthier, Thanh Nguyen
Deep learning-based QSM-RimNet can provide automated PRL detection, but this method does not provide rim segmentation for microglial density quantification and requires precise QSM lesion masks.
1 code implementation • 11 Dec 2024 • Zun Wang, Jialu Li, Yicong Hong, Songze Li, Kunchang Li, Shoubin Yu, Yi Wang, Yu Qiao, Yali Wang, Mohit Bansal, LiMin Wang
In this paper, we introduce a Self-Refining Data Flywheel (SRDF) that generates high-quality and large-scale navigational instruction-trajectory pairs by iteratively refining the data pool through the collaboration between two models, the instruction generator and the navigator, without any human-in-the-loop annotation.
1 code implementation • 6 Dec 2024 • Zhe Chen, Weiyun Wang, Yue Cao, Yangzhou Liu, Zhangwei Gao, Erfei Cui, Jinguo Zhu, Shenglong Ye, Hao Tian, Zhaoyang Liu, Lixin Gu, Xuehui Wang, Qingyun Li, Yimin Ren, Zixuan Chen, Jiapeng Luo, Jiahao Wang, Tan Jiang, Bo wang, Conghui He, Botian Shi, Xingcheng Zhang, Han Lv, Yi Wang, Wenqi Shao, Pei Chu, Zhongying Tu, Tong He, Zhiyong Wu, Huipeng Deng, Jiaye Ge, Kai Chen, Kaipeng Zhang, LiMin Wang, Min Dou, Lewei Lu, Xizhou Zhu, Tong Lu, Dahua Lin, Yu Qiao, Jifeng Dai, Wenhai Wang
We introduce InternVL 2. 5, an advanced multimodal large language model (MLLM) series that builds upon InternVL 2. 0, maintaining its core model architecture while introducing significant enhancements in training and testing strategies as well as data quality.
Ranked #1 on
Video Question Answering
on NExT-QA
1 code implementation • 1 Dec 2024 • Rongkun Zheng, Lu Qi, Xi Chen, Yi Wang, Kun Wang, Yu Qiao, Hengshuang Zhao
Recent DETR-based methods have advanced the development of Video Instance Segmentation (VIS) through transformers' efficiency and capability in modeling spatial and temporal information.
no code implementations • 22 Nov 2024 • Yi Wang, Jiaze Wang, Ziyu Guo, Renrui Zhang, Donghao Zhou, Guangyong Chen, Anfeng Liu, Pheng-Ann Heng
Recently Transformer-based models have advanced point cloud understanding by leveraging self-attention mechanisms, however, these methods often overlook latent information in less prominent regions, leading to increased sensitivity to perturbations and limited global comprehension.
no code implementations • 22 Nov 2024 • Yunxia Wu, Le Li, Zhihong Yao, Yi Wang, Gen Li, Yangsheng Jiang
Finally, numerical experiments were conducted to calculate the average fuel consumption and pollutant emissions of mixed traffic flow under different spacing control strategies, and the impact of platoon spacing control strategies on traffic flow fuel consumption and pollutant emissions was further analyzed.
no code implementations • 18 Nov 2024 • Chuang Yang, Xu Han, Tao Han, Yuejiao Su, Junyu Gao, Hongyuan Zhang, Yi Wang, Lap-Pui Chau
Meanwhile, we develop a traffic guidance assistant (TGA) scenario application to re-explore the role of traffic signs in ADS as a complement to popular autonomous technologies (such as obstacle perception).
no code implementations • 15 Nov 2024 • Kedi Zheng, Qixin Chen, Yi Wang, Chongqing Kang, Le Xie
The congestion part of LMPs is spanned by certain row vectors of the power transfer distribution factor (PTDF) matrix, and the subspace attributes of an LMP vector uniquely are found to reflect the instantaneous congestion status of all the transmission lines.
no code implementations • 11 Nov 2024 • Kedi Zheng, Qixin Chen, Yi Wang, Chongqing Kang, Qing Xia
One technique is the Maximum Information Coefficient (MIC), which can find the correlations between the non-technical loss (NTL) and a certain electricity behavior of the consumer.
no code implementations • 1 Nov 2024 • Kedi Zheng, Hanwei Xu, Zeyang Long, Yi Wang, Qixin Chen
The growing penetration of electric vehicles (EVs) significantly changes typical load curves in smart grids.
1 code implementation • 28 Oct 2024 • Wenyang Liu, Kejun Wu, Tianyi Liu, Yi Wang, Kim-Hui Yap, Lap-Pui Chau
By looking inside bytes, the bit-level details of file fragments can be accessed, enabling a more accurate classification.
no code implementations • 25 Oct 2024 • Xiangyu Zeng, Kunchang Li, Chenting Wang, Xinhao Li, Tianxiang Jiang, Ziang Yan, Songze Li, Yansong Shi, Zhengrong Yue, Yi Wang, Yali Wang, Yu Qiao, LiMin Wang
This paper proposes TimeSuite, a collection of new designs to adapt the existing short-form video MLLMs for long video understanding, including a simple yet efficient framework to process long video sequence, a high-quality video dataset for grounded tuning of MLLMs, and a carefully-designed instruction tuning task to explicitly incorporate the grounding supervision in the traditional QA format.
Ranked #7 on
Moment Retrieval
on Charades-STA
(using extra training data)
1 code implementation • 21 Oct 2024 • Ke Zhang, Junjie Li, Shuai Wang, Yangjie Wei, Yi Wang, Yannan Wang, Haizhou Li
In this work, we propose a multi-level speaker representation approach, from raw features to neural embeddings, to serve as the speaker reference cue.
no code implementations • 19 Oct 2024 • Chao Li, Jiahao Li, Jinwei Zhang, Eddy Solomon, Alexey V. Dimov, Pascal Spincemaille, Thanh D. Nguyen, Martin R. Prince, Yi Wang
Purpose: To develop an MRI technique for free-breathing 3D whole-liver quantification of water T1, water T2, proton density fat fraction (PDFF), R2*.
no code implementations • 15 Oct 2024 • Yiming Li, Yi Wang, Wenqian Wang, Dan Lin, Bingbing Li, Kim-Hui Yap
Exploring new knowledge is a fundamental human ability that can be mirrored in the development of deep neural networks, especially in the field of object detection.
1 code implementation • 12 Oct 2024 • Haiqiao Wang, Dong Ni, Yi Wang
In FiRework, we redesign the continuous deformation framework to mitigate the aforementioned errors.
1 code implementation • 9 Oct 2024 • Zhixian Wang, Linxiao Yang, Liang Sun, Qingsong Wen, Yi Wang
Time series analysis is widely used in many fields such as power energy, economics, and transportation, including different tasks such as forecasting, anomaly detection, classification, etc.
no code implementations • 7 Oct 2024 • Yongming Chen, Miner Chen, Ye Zhu, Juan Pei, Siyu Chen, Yu Zhou, Yi Wang, Yifan Zhou, Hao Li, Songan Zhang
To address this issue, we propose an efficient law article recommendation approach utilizing a Knowledge Graph (KG) and a Large Language Model (LLM).
1 code implementation • 29 Sep 2024 • Dalin Qin, Yehui Li, Weiqi Chen, Zhaoyang Zhu, Qingsong Wen, Liang Sun, Pierre Pinson, Yi Wang
Complex distribution shifts are the main obstacle to achieving accurate long-term time series forecasting.
1 code implementation • 24 Sep 2024 • Yahan Li, Yi Wang, Yi Chang, Yuan Wu
Large language models (LLMs) have demonstrated remarkable capabilities across a range of natural language processing (NLP) tasks, capturing the attention of both practitioners and the broader public.
no code implementations • 23 Sep 2024 • Junyu Xue, Yu Zhang, Xudong Wang, Yi Wang, Guoming Tang
Non-intrusive load monitoring (NILM), as a key load monitoring technology, can much reduce the deployment cost of traditional power sensors.
1 code implementation • 7 Sep 2024 • Zimu Liao, Siyan Chen, Rong Fu, Yi Wang, Zhongling Su, Hao Luo, Li Ma, Linning Xu, Bo Dai, Hengjie Li, Zhilin Pei, Xingcheng Zhang
However, adapting 3DGS to different camera models, particularly fisheye lenses, poses challenges due to the unique 3D to 2D projection calculation.
1 code implementation • 21 Aug 2024 • Zhenye Lou, Qing Xu, Zekun Jiang, Xiangjian He, Zhen Chen, Yi Wang, Chenxin Li, Maggie M. He, Wenting Duan
To alleviate the labor-intensive requirement of manual prompts, we introduce a Gaussian-Kernel Prompt Encoder (GKP-Encoder) to generate density maps driven by a single point, which guides segmentation predictions by mixing position prompts and semantic prompts.
1 code implementation • 21 Aug 2024 • Ying Zheng, Lei Yao, Yuejiao Su, Yi Zhang, Yi Wang, Sicheng Zhao, Yiyi Zhang, Lap-Pui Chau
Embodied learning for object-centric robotic manipulation is a rapidly developing and challenging area in embodied AI.
no code implementations • 21 Aug 2024 • Yiquan Wu, Bo Tang, Chenyang Xi, Yu Yu, Pengyu Wang, Yifei Liu, Kun Kuang, Haiying Deng, Zhiyu Li, Feiyu Xiong, Jie Hu, Peng Cheng, Zhonghao Wang, Yi Wang, Yi Luo, MingChuan Yang
To address the advanced requirements, we present an argument ranking model for arguments and establish a comprehensive evidence database that includes up-to-date events and classic books, thereby strengthening the substantiation of the evidence with retrieval augmented generation (RAG) technology.
no code implementations • 21 Aug 2024 • Yi Wang, Jian Ma, Ruizhi Shao, Qiao Feng, Yu-Kun Lai, Kun Li
Specifically, to achieve layer-wise clothing generation, we propose a dual-representation decoupling framework for generating clothing decoupled from the human body, in conjunction with an innovative multi-layer fusion volume rendering method.
1 code implementation • 18 Aug 2024 • Haoxin Yang, Xuemiao Xu, Cheng Xu, Huaidong Zhang, Jing Qin, Yi Wang, Pheng-Ann Heng, Shengfeng He
This paper introduces G\textsuperscript{2}Face, which leverages both generative and geometric priors to enhance identity manipulation, achieving high-quality reversible face anonymization without compromising data utility.
2 code implementations • 17 Aug 2024 • Jiawei Lian, Jianhong Pan, Lefan Wang, Yi Wang, Lap-Pui Chau, Shaohui Mei
Moreover, physical dynamics and cross-domain transformation are challenging to strictly regulate in the real world, leading to unaligned evaluation and comparison, severely hindering the development of physically robust models.
1 code implementation • 15 Aug 2024 • Guofeng Feng, Siyan Chen, Rong Fu, Zimu Liao, Yi Wang, Tao Liu, Zhilin Pei, Hengjie Li, Xingcheng Zhang, Bo Dai
This work introduces FlashGS, an open-source CUDA Python library, designed to facilitate the efficient differentiable rasterization of 3D Gaussian Splatting through algorithmic and kernel-level optimizations.
1 code implementation • 15 Aug 2024 • Nassim Ait Ali Braham, Conrad M Albrecht, Julien Mairal, Jocelyn Chanussot, Yi Wang, Xiao Xiang Zhu
To close this gap, we introduce SpectralEarth, a large-scale multi-temporal dataset designed to pretrain hyperspectral foundation models leveraging data from the Environmental Mapping and Analysis Program (EnMAP).
1 code implementation • 25 Jul 2024 • Yatao Zhang, Yi Wang, Song Gao, Martin Raubal
This study proposes a novel context-aware knowledge graph (CKG) framework to enhance traffic speed forecasting by effectively modeling spatial and temporal contexts.
1 code implementation • 19 Jul 2024 • Qing Xu, Jiaxuan Li, Xiangjian He, Ziyu Liu, Zhen Chen, Wenting Duan, Chenxin Li, Maggie M. He, Fiseha B. Tesema, Wooi P. Cheah, Yi Wang, Rong Qu, Jonathan M. Garibaldi
Finally, we design the Query-Decoupled Modality Decoder (QDMD) that leverages a one-to-one strategy to provide an independent decoding channel for every modality.
1 code implementation • 19 Jul 2024 • Xun Liang, Shichao Song, Zifan Zheng, Hanyu Wang, Qingchen Yu, Xunkai Li, Rong-Hua Li, Yi Wang, Zhonghao Wang, Feiyu Xiong, Zhiyu Li
In this paper, we use a unified perspective of internal consistency, offering explanations for reasoning deficiencies and hallucinations.
1 code implementation • 18 Jul 2024 • Rongkun Zheng, Lu Qi, Xi Chen, Yi Wang, Kun Wang, Yu Qiao, Hengshuang Zhao
To bridge the gap between image and video, in this work, we propose a new video segmentation task - video reasoning segmentation.
1 code implementation • 16 Jul 2024 • Lei Yao, Yi Wang, Moyun Liu, Lap-Pui Chau
Specifically, the principle of our SMQ initialization scheme is to leverage the predicted voxel-wise semantic information to implicitly generate the scene-aware query, yielding adequate scene prior and compensating for the learnable query set.
1 code implementation • 10 Jul 2024 • Wanggui He, Siming Fu, Mushui Liu, Xierui Wang, Wenyi Xiao, Fangxun Shu, Yi Wang, Lei Zhang, Zhelun Yu, Haoyuan Li, Ziwei Huang, Leilei Gan, Hao Jiang
Auto-regressive models have made significant progress in the realm of language generation, yet they do not perform on par with diffusion models in the domain of image synthesis.
1 code implementation • 8 Jul 2024 • Yuejiao Su, Yi Wang, Lap-Pui Chau
Egocentric hand-object segmentation (EgoHOS) is a promising new task aiming at segmenting hands and interacting objects in egocentric images.
no code implementations • 3 Jul 2024 • Shujie Hu, Xurong Xie, Mengzhe Geng, Zengrui Jin, Jiajun Deng, Guinan Li, Yi Wang, Mingyu Cui, Tianzi Wang, Helen Meng, Xunying Liu
Experiments are conducted on four tasks: the English UASpeech and TORGO dysarthric speech corpora; and the English DementiaBank Pitt and Cantonese JCCOCC MoCA elderly speech datasets.
no code implementations • 30 Jun 2024 • Haiqiao Wang, Hong Wu, Zhuoyuan Wang, Peiyan Yue, Dong Ni, Pheng-Ann Heng, Yi Wang
In consequence, this survey provides a \textcolor{blue}{narrative } analysis of this field, outlining the evolution of image processing methods in the context of TRUS image analysis and meanwhile highlighting their relevant contributions.
1 code implementation • 20 Jun 2024 • Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zou, Jianhua Zhou, Yi Wang
In this study, we propose a novel learning framework for clinically significant prostate cancer (csPCa) classification using multi-modality TRUS.
1 code implementation • 18 Jun 2024 • Xiaoqi Wang, Yi Wang, Lap-Pui Chau
In this report, we present our champion solution for EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge in CVPR 2024.
Ranked #1 on
Multi-Instance Retrieval
on EPIC-KITCHENS-100
1 code implementation • 18 Jun 2024 • Zhuoyuan Wang, Haiqiao Wang, Yi Wang
Most existing deep learning-based registration methods are trained on single-type images to address same-domain tasks. However, cross-domain deformable registration remains challenging. We argue that the tailor-made matching criteria in traditional registration methods is one of the main reason they are applicable in different domains. Motivated by this, we devise a registration-oriented encoder to model the matching criteria of image features and structural features, which is beneficial to boost registration accuracy and adaptability. Specifically, a general feature encoder (Encoder-G) is proposed to capture comprehensive medical image features, while a structural feature encoder (Encoder-S) is designed to encode the structural self-similarity into the global representation. Extensive experiments on images from three different domains prove the efficacy of the proposed method.
1 code implementation • 12 Jun 2024 • Qingyun Li, Zhe Chen, Weiyun Wang, Wenhai Wang, Shenglong Ye, Zhenjiang Jin, Guanzhou Chen, Yinan He, Zhangwei Gao, Erfei Cui, Jiashuo Yu, Hao Tian, Jiasheng Zhou, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Zhenxiang Li, Pei Chu, Yi Wang, Min Dou, Changyao Tian, Xizhou Zhu, Lewei Lu, Yushi Chen, Junjun He, Zhongying Tu, Tong Lu, Yali Wang, LiMin Wang, Dahua Lin, Yu Qiao, Botian Shi, Conghui He, Jifeng Dai
In this paper, we introduce OmniCorpus, a 10 billion-scale image-text interleaved dataset.
no code implementations • 10 Jun 2024 • Mengshuo Jia, Gabriela Hug, Ning Zhang, Zhaojian Wang, Yi Wang, Chongqing Kang
Subsequently, this paper evaluates a total of 44 methods, containing over 30 existing DPFL approaches, some innovative DPFL techniques, and several classic physics-driven power flow linearization methods for benchmarking.
no code implementations • 10 Jun 2024 • Mengshuo Jia, Gabriela Hug, Ning Zhang, Zhaojian Wang, Yi Wang, Chongqing Kang
Further, this tutorial implements extensive numerical comparisons of all existing DPFL methods (40 methods in total) and four classic physics-driven approaches, focusing on their generalizability, applicability, accuracy, and computational efficiency.
1 code implementation • 30 May 2024 • Yi Wang, Conrad M Albrecht, Xiao Xiang Zhu
Second, we revisit and explore cross-domain continual pretraining for both multispectral and SAR imagery, building efficient EO foundation models from strongest vision models such as DINOv2.
no code implementations • 28 May 2024 • Jiaze Wang, Yi Wang, Ziyu Guo, Renrui Zhang, Donghao Zhou, Guangyong Chen, Anfeng Liu, Pheng-Ann Heng
We introduce MM-Mixing, a multi-modal mixing alignment framework for 3D understanding.
no code implementations • 26 May 2024 • Chao Li, Jinwei Zhang, Hang Zhang, Jiahao Li, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang
Purpose: To develop a pipeline for motion artifact correction in mGRE and quantitative susceptibility mapping (QSM).
no code implementations • 17 May 2024 • Yi Wang, Qian Zhou, David Ledo
The process creates "living stories" that dynamically adapt to various game world states, resulting in narratives co-created by the author, character simulation, and player.
no code implementations • 9 May 2024 • Sheng Yan, Xin Du, Zongying Li, Yi Wang, Hongcang Jin, Mengyuan Liu
Temporal grounding is crucial in multimodal learning, but it poses challenges when applied to animal behavior data due to the sparsity and uniform distribution of moments.
1 code implementation • 8 May 2024 • Huaiyuan Xu, Junliang Chen, Shiyu Meng, Yi Wang, Lap-Pui Chau
A comprehensive list of studies in this survey is publicly available in an active repository that continuously collects the latest work: https://github. com/HuaiyuanXu/3D-Occupancy-Perception.
no code implementations • 7 May 2024 • Xiao Xiang Zhu, Zhitong Xiong, Yi Wang, Adam J. Stewart, Konrad Heidler, Yuanyuan Wang, Zhenghang Yuan, Thomas Dujardin, Qingsong Xu, Yilei Shi
Foundation models have enormous potential in advancing Earth and climate sciences, however, current approaches may not be optimal as they focus on a few basic features of a desirable Earth and climate foundation model.
no code implementations • 2 May 2024 • Chenying Liu, Conrad Albrecht, Yi Wang, Xiao Xiang Zhu
We study the potential of noisy labels y to pretrain semantic segmentation models in a multi-modal learning framework for geospatial applications.
1 code implementation • 2 May 2024 • Yu Zhou, Xiahao Zou, Yi Wang
The proposed method achieves an average DSC of 93. 36%, ASSD of 0. 85mm, 95HD of 7. 51mm.
no code implementations • 30 Apr 2024 • Yi Li, Renyou Xie, Chaojie Li, Yi Wang, ZhaoYang Dong
To address these challenges, a federated graph learning approach involving multiple charging stations is proposed to collaboratively train a more generalized deep learning model for demand forecasting while capturing spatial correlations among various stations and enhancing robustness against potential attacks.
1 code implementation • 29 Apr 2024 • Yiyuan Yang, Ming Jin, Haomin Wen, Chaoli Zhang, Yuxuan Liang, Lintao Ma, Yi Wang, Chenghao Liu, Bin Yang, Zenglin Xu, Jiang Bian, Shirui Pan, Qingsong Wen
Conditioned models, on the other hand, utilize extra information to enhance performance and are similarly divided for both predictive and generative tasks.
1 code implementation • 28 Apr 2024 • Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo
Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints.
1 code implementation • 22 Apr 2024 • Zixian Li, Yuming Zhong, Yi Wang
In this study, we propose to fully leverage the dynamic characteristics from the kinetic curves as well as the radiomic features to boost the classification accuracy of benign and malignant breast lesions.
no code implementations • 21 Apr 2024 • Cai Chen, Runzhong Zhang, Jianjun Gao, Kejun Wu, Kim-Hui Yap, Yi Wang
Temporal sentence grounding involves the retrieval of a video moment with a natural language query.
no code implementations • 8 Apr 2024 • Su-Xi Yu, Jing-Yuan He, Yi Wang, Yu-Jiao Cai, Jun Yang, Bo Lin, Wei-Bin Yang, Jian Ruan
Graves' disease is a common condition that is diagnosed clinically by determining the smoothness of the thyroid texture and its morphology in ultrasound images.
1 code implementation • 2 Apr 2024 • Zhuoyuan Wang, Dong Sun, Xiangyun Zeng, Ruodai Wu, Yi Wang
Accordingly, we propose a contextual embedding learning approach to facilitate 2D CNNs capturing spatial information properly.
2 code implementations • 25 Mar 2024 • Haiqiao Wang, Zhuoyuan Wang, Dong Ni, Yi Wang
Deformable image registration plays a crucial role in medical imaging, aiding in disease diagnosis and image-guided interventions.
1 code implementation • 23 Mar 2024 • Huiping Zhuang, Yuchen Liu, Run He, Kai Tong, Ziqian Zeng, Cen Chen, Yi Wang, Lap-Pui Chau
In this paper, we propose an exemplar-free approach--Forward-only Online Analytic Learning (F-OAL).
2 code implementations • 22 Mar 2024 • Yi Wang, Kunchang Li, Xinhao Li, Jiashuo Yu, Yinan He, Chenting Wang, Guo Chen, Baoqi Pei, Ziang Yan, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, Songze Li, Hongjie Zhang, Yifei HUANG, Yu Qiao, Yali Wang, LiMin Wang
We introduce InternVideo2, a new family of video foundation models (ViFM) that achieve the state-of-the-art results in video recognition, video-text tasks, and video-centric dialogue.
Ranked #1 on
Action Classification
on MIT
1 code implementation • 22 Mar 2024 • Zhitong Xiong, Yi Wang, Fahong Zhang, Adam J. Stewart, Joëlle Hanna, Damian Borth, Ioannis Papoutsis, Bertrand Le Saux, Gustau Camps-Valls, Xiao Xiang Zhu
The development of foundation models has revolutionized our ability to interpret the Earth's surface using satellite observational data.
1 code implementation • 14 Mar 2024 • Yunfei Cheng, Aonan Zhang, Xuanyu Zhang, Chong Wang, Yi Wang
We present Recurrent Drafter (ReDrafter), an advanced speculative decoding approach that achieves state-of-the-art speedup for large language models (LLMs) inference.
no code implementations • 11 Mar 2024 • Yinyan Liu, Yi Wang, Jin Ma
Non-Intrusive Load Monitoring (NILM) is pivotal in today's energy landscape, offering vital solutions for energy conservation and efficient management.
2 code implementations • 11 Mar 2024 • Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, LiMin Wang, Yu Qiao
Addressing the dual challenges of local redundancy and global dependencies in video understanding, this work innovatively adapts the Mamba to the video domain.
Ranked #56 on
Action Classification
on Kinetics-400
1 code implementation • 5 Mar 2024 • Xin Chen, Hanxian Huang, Yanjun Gao, Yi Wang, Jishen Zhao, Ke Ding
Knowledge distillation, the technique of transferring knowledge from large, complex models to smaller ones, marks a pivotal step towards efficient AI deployment.
1 code implementation • 3 Mar 2024 • Chenying Liu, Conrad M Albrecht, Yi Wang, Qingyu Li, Xiao Xiang Zhu
AIO2 utilizes a mean teacher model to enhance training robustness with noisy labels to both stabilize the training accuracy curve for fitting in ACT and provide pseudo labels for correction in O2C.
no code implementations • 25 Feb 2024 • Chenying Liu, Conrad M Albrecht, Yi Wang, Xiao Xiang Zhu
Compared to supervised deep learning, self-supervision provides remote sensing a tool to reduce the amount of exact, human-crafted geospatial annotations.
1 code implementation • 25 Feb 2024 • Ruibin Yuan, Hanfeng Lin, Yi Wang, Zeyue Tian, Shangda Wu, Tianhao Shen, Ge Zhang, Yuhang Wu, Cong Liu, Ziya Zhou, Ziyang Ma, Liumeng Xue, Ziyu Wang, Qin Liu, Tianyu Zheng, Yizhi Li, Yinghao Ma, Yiming Liang, Xiaowei Chi, Ruibo Liu, Zili Wang, Pengfei Li, Jingcheng Wu, Chenghua Lin, Qifeng Liu, Tao Jiang, Wenhao Huang, Wenhu Chen, Emmanouil Benetos, Jie Fu, Gus Xia, Roger Dannenberg, Wei Xue, Shiyin Kang, Yike Guo
It is based on continual pre-training and finetuning LLaMA2 on a text-compatible music representation, ABC notation, and the music is treated as a second language.
1 code implementation • 14 Feb 2024 • Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zhou, Jianhua Zhou, Yi Wang
With the aim of effectively identifying prostate cancer, we propose a framework for the classification of clinically significant prostate cancer (csPCa) from multi-modality TRUS videos.
1 code implementation • 14 Feb 2024 • Zhuoyuan Wang, Haiqiao Wang, Yi Wang
The advent of deep-learning-based registration networks has addressed the time-consuming challenge in traditional iterative methods. However, the potential of current registration networks for comprehensively capturing spatial relationships has not been fully explored, leading to inadequate performance in large-deformation image registration. The pure convolutional neural networks (CNNs) neglect feature enhancement, while current Transformer-based networks are susceptible to information redundancy. To alleviate these issues, we propose a pyramid attention network (PAN) for deformable medical image registration. Specifically, the proposed PAN incorporates a dual-stream pyramid encoder with channel-wise attention to boost the feature representation. Moreover, a multi-head local attention Transformer is introduced as decoder to analyze motion patterns and generate deformation fields. Extensive experiments on two public brain magnetic resonance imaging (MRI) datasets and one abdominal MRI dataset demonstrate that our method achieves favorable registration performance, while outperforming several CNN-based and Transformer-based registration networks. Our code is publicly available at https://github. com/JuliusWang-7/PAN.
no code implementations • 8 Feb 2024 • Wei Wang, Huilong Ning, Gaowei Zhang, Libo Liu, Yi Wang
Our study thus provides first-hand insights into using ChatGPT to fulfill software engineering tasks with real-world developers and motivates the need for novel interaction mechanisms that help developers effectively work with large language models to achieve desired outcomes.
no code implementations • 2 Feb 2024 • Andrew Ye, James Xu, Vidyut Veedgav, Yi Wang, Yifan Yu, Daniel Yan, Ryan Chen, Vipin Chaudhary, Shuai Xu
We propose and study the integration of sentiment analysis and deep reinforcement learning ensemble algorithms for stock trading by evaluating strategies capable of dynamically altering their active agent given the concurrent market environment.
no code implementations • 26 Jan 2024 • Chaochao Lu, Chen Qian, Guodong Zheng, Hongxing Fan, Hongzhi Gao, Jie Zhang, Jing Shao, Jingyi Deng, Jinlan Fu, Kexin Huang, Kunchang Li, Lijun Li, LiMin Wang, Lu Sheng, Meiqi Chen, Ming Zhang, Qibing Ren, Sirui Chen, Tao Gui, Wanli Ouyang, Yali Wang, Yan Teng, Yaru Wang, Yi Wang, Yinan He, Yingchun Wang, Yixu Wang, Yongting Zhang, Yu Qiao, Yujiong Shen, Yurong Mou, Yuxi Chen, Zaibin Zhang, Zhelun Shi, Zhenfei Yin, Zhipin Wang
Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents.
1 code implementation • 16 Jan 2024 • Zichuan Liu, Yingying Zhang, Tianchun Wang, Zefan Wang, Dongsheng Luo, Mengnan Du, Min Wu, Yi Wang, Chunlin Chen, Lunting Fan, Qingsong Wen
Explaining multivariate time series is a compound challenge, as it requires identifying important locations in the time series and matching complex temporal patterns.
no code implementations • 15 Jan 2024 • Zhitong Xiong, Yi Wang, Fahong Zhang, Xiao Xiang Zhu
Current remote sensing foundation models typically specialize in a single modality or a specific spatial resolution range, limiting their versatility for downstream datasets.
no code implementations • 10 Jan 2024 • Shubao Zhao, Ming Jin, Zhaoxiang Hou, Chengyi Yang, Zengxiang Li, Qingsong Wen, Yi Wang
Time series forecasting is crucial and challenging in the real world.
Ranked #18 on
Time Series Forecasting
on ETTh1 (336) Multivariate
no code implementations • 28 Dec 2023 • Hang Zhang, Thanh D. Nguyen, Jinwei Zhang, Renjiu Hu, Susan A. Gauthier, Yi Wang
We validated RimSet using simulated QSM images and an in vivo dataset of 172 MS subjects with 177 rim+ and 3986 rim-lesions.
1 code implementation • 28 Dec 2023 • Chenxi Wang, Pierre Pinson, Yi Wang
The relationship between (i) errors in both time and frequency domains and (ii) operational value of the forecasts is analysed.
no code implementations • 26 Dec 2023 • Madiha Fatima, Zhihua Cao, Aichun Huang, Shengyuan Wu, Xinxian Fan, Yi Wang, Liu Jiren, Ziyun Zhu, Qiongrou Ye, Yuan Ma, Joseph K. F Chow, Peng Jia, Yangshou Liu, Yubin Lin, Manjun Ye, Tong Wu, ZHIXUN LI, Cong Cai, Wenhai Zhang, Cheris H. Q. Ding, Yuanzhe Cai, Feijuan Huang
With the global spread and increasing transmission rate of SARS-CoV-2, more and more laboratories and researchers are turning their attention to wastewater-based epidemiology (WBE), hoping it can become an effective tool for large-scale testing and provide more ac-curate predictions of the number of infected individuals.
1 code implementation • 14 Dec 2023 • Mingyang Chen, Bo Huang, Junda Lu, Bing Li, Yi Wang, Minhao Cheng, Wei Wang
This ensures the memory efficiency of our method and provides a flexible tradeoff between time and memory budgets, allowing us to distil ImageNet-1K using a minimum of only 6. 5GB of GPU memory.
1 code implementation • NeurIPS 2023 • Rongkun Zheng, Lu Qi, Xi Chen, Yi Wang, Kun Wang, Yu Qiao, Hengshuang Zhao
What we possess are numerous isolated filed-specific datasets, thus, it is appealing to jointly train models across the aggregation of datasets to enhance data volume and diversity.
1 code implementation • 11 Dec 2023 • Yao Sun, Yi Wang, Michael Eineder
Quick and automated earthquake-damaged building detection from post-event satellite imagery is crucial, yet it is challenging due to the scarcity of training data required to develop robust algorithms.
no code implementations • 10 Dec 2023 • Yi Wang, Jian Ma, Ruizhi Shao, Qiao Feng, Yu-Kun Lai, Yebin Liu, Kun Li
To keep the generated clothing consistent with the target text, we propose a semantic-confidence strategy for clothing that can eliminate the non-clothing content generated by the model.
3 code implementations • CVPR 2024 • Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Yi Liu, Zun Wang, Jilan Xu, Guo Chen, Ping Luo, LiMin Wang, Yu Qiao
With the rapid development of Multi-modal Large Language Models (MLLMs), a number of diagnostic benchmarks have recently emerged to evaluate the comprehension capabilities of these models.
no code implementations • 17 Nov 2023 • Renjiu Hu, Qihao Zhang, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang
The trained network was further tested in a synthetic brain ASL image based on vasculature network extracted from magnetic resonance (MR) angiography.
no code implementations • 16 Nov 2023 • Yangze Zhou, Qingsong Wen, Jie Song, Xueyuan Cui, Yi Wang
Accurate load forecasting serves as the foundation for the flexible operation of multi-energy systems (MES).
no code implementations • 6 Nov 2023 • Cheng Feng, Kedi Zheng, Yi Wang, Kaibin Huang, Qixin Chen
We formulate a bandwidth allocation problem aimed at maximizing the information utility gain of transmitted data brought to CPS operation goals.
1 code implementation • 30 Oct 2023 • Yizhuo Li, Kunchang Li, Yinan He, Yi Wang, Yali Wang, LiMin Wang, Yu Qiao, Ping Luo
Building video-language foundation models is costly and difficult due to the redundant nature of video data and the lack of high-quality video-language datasets.
1 code implementation • 28 Oct 2023 • Yi Wang, Hugo Hernández Hernández, Conrad M Albrecht, Xiao Xiang Zhu
Self-supervised learning guided by masked image modelling, such as Masked AutoEncoder (MAE), has attracted wide attention for pretraining vision transformers in remote sensing.
Ranked #1 on
Multi-Label Image Classification
on BigEarthNet-S1 (official test set)
(using extra training data)
no code implementations • 25 Oct 2023 • Ji Jiang, Meng Cao, Tengtao Song, Long Chen, Yi Wang, Yuexian Zou
Video Referring Expression Comprehension (REC) aims to localize a target object in videos based on the queried natural language.
6 code implementations • 16 Oct 2023 • Ming Jin, Qingsong Wen, Yuxuan Liang, Chaoli Zhang, Siqiao Xue, Xue Wang, James Zhang, Yi Wang, Haifeng Chen, XiaoLi Li, Shirui Pan, Vincent S. Tseng, Yu Zheng, Lei Chen, Hui Xiong
In this survey, we offer a comprehensive and up-to-date review of large models tailored (or adapted) for time series and spatio-temporal data, spanning four key facets: data types, model categories, model scopes, and application areas/tasks.
1 code implementation • 26 Sep 2023 • Yi Wang
We present a holistic approach for high resolution image classification that won second place in the ICCV/CVPPA2023 Deep Nutrient Deficiency Challenge.
2 code implementations • 26 Sep 2023 • Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu
To this end, we propose LaVie, an integrated video generation framework that operates on cascaded video latent diffusion models, comprising a base T2V model, a temporal interpolation model, and a video super-resolution model.
Ranked #4 on
Text-to-Video Generation
on EvalCrafter Text-to-Video (ECTV) Dataset
(using extra training data)
1 code implementation • 26 Sep 2023 • Yi Wang, Jieliang Luo, Adam Gaier, Evan Atherton, Hilmar Koch
We develop a method of generating datasets of facility layout tasks, create a gym-like environment for experimenting with and evaluating different methods, and further analyze the two methods with comprehensive experiments, aiming to provide insights for solving facility layout tasks.
1 code implementation • NeurIPS 2023 • Tianyi Liu, Kejun Wu, Yi Wang, Wenyang Liu, Kim-Hui Yap, Lap-Pui Chau
The past decade has witnessed great strides in video recovery by specialist technologies, like video inpainting, completion, and error concealment.
no code implementations • 19 Sep 2023 • Jianjun Gao, Yi Wang, Kim-Hui Yap, Kratika Garg, Boon Siew Han
Particularly, the improvements on IDF1, IDSw, AssA, and AssR demonstrate the effectiveness of our OccluTrack on tracking and association performance.
2 code implementations • 11 Sep 2023 • Yi Wang, Conrad M Albrecht, Nassim Ait Ali Braham, Chenying Liu, Zhitong Xiong, Xiao Xiang Zhu
The increasing availability of multi-sensor data sparks wide interest in multimodal self-supervised learning.
no code implementations • 5 Sep 2023 • Md Ferdous Alam, Yi Wang, Chin-Yi Cheng, Jieliang Luo
We develop the preference model by estimating the density of the learned representations whereas we train an autoregressive transformer model for sequential design generation.
no code implementations • 4 Sep 2023 • Cheng Feng, Linbin Huang, Xiuqiang He, Yi Wang, Florian Dörfler, Qixin Chen
To address this gap, this paper defines the joint oscillation damping and inertia provision services at the system level, seeking to encourage converter-interfaced generation to provide enhanced damping and fast frequency response capabilities.
1 code implementation • ICCV 2023 • Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, Mukund Varma, Yi Wang, Zhangyang Wang
Cross-scene generalizable NeRF models, which can directly synthesize novel views of unseen scenes, have become a new spotlight of the NeRF field.
1 code implementation • 4 Aug 2023 • Yi Wang, Chenying Liu, Arti Tiwari, Micha Silver, Arnon Karnieli, Xiao Xiang Zhu, Conrad M Albrecht
Discovering ancient agricultural terraces in desert regions is important for the monitoring of long-term climate changes on the Earth's surface.
1 code implementation • ICCV 2023 • Zun Wang, Jialu Li, Yicong Hong, Yi Wang, Qi Wu, Mohit Bansal, Stephen Gould, Hao Tan, Yu Qiao
Recent research in language-guided visual navigation has demonstrated a significant demand for the diversity of traversable environments and the quantity of supervision for training generalizable agents.
no code implementations • 21 Jul 2023 • Yihan Wang, Qiao Yan, Yi Wang
LiDAR-based 3D object detection is of paramount importance for autonomous driving.
1 code implementation • 14 Jul 2023 • Zhixian Wang, Qingsong Wen, Chaoli Zhang, Liang Sun, Leandro Von Krannichfeldt, Shirui Pan, Yi Wang
However, there are many differences between energy forecasting and traditional time series forecasting.
1 code implementation • 13 Jul 2023 • Yi Wang, Yinan He, Yizhuo Li, Kunchang Li, Jiashuo Yu, Xin Ma, Xinhao Li, Guo Chen, Xinyuan Chen, Yaohui Wang, Conghui He, Ping Luo, Ziwei Liu, Yali Wang, LiMin Wang, Yu Qiao
Specifically, we utilize a multi-scale approach to generate video-related descriptions.
1 code implementation • 29 Jun 2023 • Yuming Zhong, Yi Wang
The network first utilizes the pseudo-masks generated using the extreme points to train itself, by minimizing a contrastive loss, which encourages the network to learn more representative features for cancerous voxels.
no code implementations • 27 Jun 2023 • Tianzi Wang, Shoukang Hu, Jiajun Deng, Zengrui Jin, Mengzhe Geng, Yi Wang, Helen Meng, Xunying Liu
Automatic recognition of disordered and elderly speech remains highly challenging tasks to date due to data scarcity.
no code implementations • 19 Jun 2023 • Shivam Pande, Nassim Ait Ali Braham, Yi Wang, Conrad M Albrecht, Biplab Banerjee, Xiao Xiang Zhu
Recently, to effectively train the deep learning models with minimal labelled samples, the unlabeled samples are also being leveraged in self-supervised and semi-supervised setting.
2 code implementations • NeurIPS 2023 • Adam J. Stewart, Nils Lehmann, Isaac A. Corley, Yi Wang, Yi-Chia Chang, Nassim Ait Ali Braham, Shradha Sehgal, Caleb Robinson, Arindam Banerjee
The Landsat program is the longest-running Earth observation program in history, with 50+ years of data acquisition by 8 satellites.
no code implementations • 15 Jun 2023 • Junting Pan, Ziyi Lin, Yuying Ge, Xiatian Zhu, Renrui Zhang, Yi Wang, Yu Qiao, Hongsheng Li
Video Question Answering (VideoQA) has been significantly advanced from the scaling of recent Large Language Models (LLMs).
Ranked #3 on
Temporal/Casual QA
on NExT-QA
(using extra training data)
no code implementations • 14 Jun 2023 • Hengbo Liu, Ziqing Ma, Linxiao Yang, Tian Zhou, Rui Xia, Yi Wang, Qingsong Wen, Liang Sun
In this paper, we propose a novel forecasting framework, named Self-adaptive Decomposed Interpretable framework~(SaDI), which ensembles long-term trend, short-term trend, and period modelings to capture temporal characteristics in different components.
1 code implementation • 13 Jun 2023 • Chen Cai, Suchen Wang, Kim-Hui Yap, Yi Wang
Weakly-supervised grounded image captioning (WSGIC) aims to generate the caption and ground (localize) predicted object words in the input image without using bounding box supervision.
1 code implementation • 9 Jun 2023 • Haiqiao Wang, Dong Ni, Yi Wang
The Transformer structures have been widely used in computer vision and have recently made an impact in the area of medical image registration.
1 code implementation • 31 May 2023 • Zhixian Wang, Qingsong Wen, Chaoli Zhang, Liang Sun, Yi Wang
The uncertainties in load forecasting can be divided into two types: epistemic uncertainty and aleatoric uncertainty.
1 code implementation • 24 May 2023 • Zhitong Xiong, Sining Chen, Yi Wang, Lichao Mou, Xiao Xiang Zhu
Towards a fair and comprehensive analysis of existing methods, the proposed benchmark consists of 1) a large-scale dataset including co-registered RGB and nDSM pairs and pixel-wise semantic labels; 2) a comprehensive evaluation and analysis of existing multi-modal fusion strategies for both convolutional and Transformer-based networks on remote sensing data.
Ranked #1 on
Semantic Segmentation
on GAMUS
1 code implementation • 22 May 2023 • Guo Chen, Yin-Dong Zheng, Jiahao Wang, Jilan Xu, Yifei HUANG, Junting Pan, Yi Wang, Yali Wang, Yu Qiao, Tong Lu, LiMin Wang
Building upon this insight, we propose a novel framework called VideoLLM that leverages the sequence reasoning capabilities of pre-trained LLMs from natural language processing (NLP) for video sequence understanding.
1 code implementation • 10 May 2023 • Kunchang Li, Yinan He, Yi Wang, Yizhuo Li, Wenhai Wang, Ping Luo, Yali Wang, LiMin Wang, Yu Qiao
In this paper, we initiate an attempt of developing an end-to-end chat-centric video understanding system, coined as VideoChat.
Ranked #3 on
Question Answering
on NExT-QA (Open-ended VideoQA)
Video-based Generative Performance Benchmarking (Consistency)
Video-based Generative Performance Benchmarking (Contextual Understanding)
+5
2 code implementations • 9 May 2023 • Zhaoyang Liu, Yinan He, Wenhai Wang, Weiyun Wang, Yi Wang, Shoufa Chen, Qinglong Zhang, Zeqiang Lai, Yang Yang, Qingyun Li, Jiashuo Yu, Kunchang Li, Zhe Chen, Xue Yang, Xizhou Zhu, Yali Wang, LiMin Wang, Ping Luo, Jifeng Dai, Yu Qiao
Different from existing interactive systems that rely on pure language, by incorporating pointing instructions, the proposed iGPT significantly improves the efficiency of communication between users and chatbots, as well as the accuracy of chatbots in vision-centric tasks, especially in complicated visual scenarios where the number of objects is greater than 2.
no code implementations • 5 May 2023 • Jinwei Zhang, Alexey Dimov, Chao Li, Hang Zhang, Thanh D. Nguyen, Pascal Spincemaille, Yi Wang
Purpose: To improve the generalization ability of convolutional neural network (CNN) based prediction of quantitative susceptibility mapping (QSM) from high-pass filtered phase (HPFP) image.
1 code implementation • 26 Apr 2023 • Ruizhe Zheng, Jun Li, Yi Wang, Tian Luo, Yuguo Yu
Patient-independent detection of epileptic activities based on visual spectral representation of continuous EEG (cEEG) has been widely used for diagnosing epilepsy.
no code implementations • 24 Apr 2023 • Pengcheng Ai, Le Xiao, Zhi Deng, Yi Wang, Xiangming Sun, Guangming Huang, Dong Wang, Yulei Li, Xinchi Ran
We mathematically demonstrate the existence of the optimal function desired by the method, and give a systematic algorithm for training and calibration of the model.
no code implementations • 22 Apr 2023 • Gong Chen, Yanan Zhao, Yi Wang, Kim-Hui Yap
Recently, synthetic aperture radar (SAR) image change detection has become an interesting yet challenging direction due to the presence of speckle noise.
1 code implementation • 22 Apr 2023 • Alexandra G. Roberts, Dominick J. Romano, Mert Şişman, Alexey V. Dimov, Pascal Spincemaille, Thanh D. Nguyen, Ilhami Kovanlikaya, Susan A. Gauthier, Yi Wang
To develop a tissue field filtering algorithm, called maximum Spherical Mean Value (mSMV), for reducing shadow artifacts in quantitative susceptibility mapping (QSM) of the brain without requiring brain tissue erosion. Residual background field is a major source of shadow artifacts in QSM.
1 code implementation • CVPR 2023 • Wenyang Liu, Yi Wang, Kim-Hui Yap, Lap-Pui Chau
In this paper, we study a real-world JPEG image restoration problem with bit errors on the encrypted bitstream.
1 code implementation • 14 Apr 2023 • Wenyang Liu, Yi Wang, Kejun Wu, Kim-Hui Yap, Lap-Pui Chau
File fragment classification (FFC) on small chunks of memory is essential in memory forensics and Internet security.
no code implementations • 7 Apr 2023 • Jinwei Zhang, Thanh D. Nguyen, Eddy Solomon, Chao Li, Qihao Zhang, Jiahao Li, Hang Zhang, Pascal Spincemaille, Yi Wang
Results: The retrospective ablation study showed improved image sharpness of mcLARO compared to the baseline network without multi-contrast sampling pattern optimization or image feature fusion, and negligible bias and narrow 95% limits of agreement on regional T1, T2, T2* and QSM values were obtained by the under-sampled reconstructions compared to the fully sampled reconstruction.
no code implementations • 7 Apr 2023 • Gongning Luo, Kuanquan Wang, Jun Liu, Shuo Li, Xinjie Liang, Xiangyu Li, Shaowei Gan, Wei Wang, Suyu Dong, Wenyi Wang, Pengxin Yu, Enyou Liu, Hongrong Wei, Na Wang, Jia Guo, Huiqi Li, Zhao Zhang, Ziwei Zhao, Na Gao, Nan An, Ashkan Pakzad, Bojidar Rangelov, Jiaqi Dou, Song Tian, Zeyu Liu, Yi Wang, Ampatishan Sivalingam, Kumaradevan Punithakumar, Zhaowen Qiu, Xin Gao
Efficient automatic segmentation of multi-level (i. e. main and branch) pulmonary arteries (PA) in CTPA images plays a significant role in clinical applications.
1 code implementation • CVPR 2023 • LiMin Wang, Bingkun Huang, Zhiyu Zhao, Zhan Tong, Yinan He, Yi Wang, Yali Wang, Yu Qiao
Finally, we successfully train a video ViT model with a billion parameters, which achieves a new state-of-the-art performance on the datasets of Kinetics (90. 0% on K400 and 89. 9% on K600) and Something-Something (68. 7% on V1 and 77. 0% on V2).
Ranked #1 on
Self-Supervised Action Recognition
on UCF101
(using extra training data)
1 code implementation • ICCV 2023 • Kunchang Li, Yali Wang, Yizhuo Li, Yi Wang, Yinan He, LiMin Wang, Yu Qiao
Previous VFMs rely on Image Foundation Models (IFMs), which face challenges in transferring to the video domain.
Ranked #1 on
Video Retrieval
on SSv2-template retrieval
(using extra training data)
1 code implementation • CVPR 2023 • Kun Zhou, Wenbo Li, Yi Wang, Tao Hu, Nianjuan Jiang, Xiaoguang Han, Jiangbo Lu
Neural radiance fields (NeRF) show great success in novel view synthesis.
Ranked #1 on
Novel View Synthesis
on LLFF
no code implementations • 12 Mar 2023 • Yi Wang, Jiaze Wang, Jinpeng Li, Zixu Zhao, Guangyong Chen, Anfeng Liu, Pheng-Ann Heng
With Point-MAE as our baseline, our model surpasses previous methods by a significant margin, achieving 86. 3% accuracy on ScanObjectNN and 94. 1% accuracy on ModelNet40.
no code implementations • 28 Feb 2023 • Shujie Hu, Xurong Xie, Zengrui Jin, Mengzhe Geng, Yi Wang, Mingyu Cui, Jiajun Deng, Xunying Liu, Helen Meng
Experiments conducted on the UASpeech dysarthric and DementiaBank Pitt elderly speech corpora suggest TDNN and Conformer ASR systems integrated domain adapted wav2vec2. 0 models consistently outperform the standalone wav2vec2. 0 models by statistically significant WER reductions of 8. 22% and 3. 43% absolute (26. 71% and 15. 88% relative) on the two tasks respectively.
1 code implementation • 2 Feb 2023 • Minglun Han, Qingyu Wang, Tielin Zhang, Yi Wang, Duzhen Zhang, Bo Xu
The spiking neural network (SNN) using leaky-integrated-and-fire (LIF) neurons has been commonly used in automatic speech recognition (ASR) tasks.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+1
no code implementations • 25 Jan 2023 • Chengqian Ma, Zhiqiang Wu, Chunlei Cai, Pengwei Zhang, Yi Wang, Long Zheng, Chao Chen, Quan Zhou
In the past decades, lots of progress have been done in the video compression field including traditional video codec and learning-based video codec.
1 code implementation • CVPR 2023 • Jilan Xu, Junlin Hou, Yuejie Zhang, Rui Feng, Yi Wang, Yu Qiao, Weidi Xie
The former aims to infer all masked entities in the caption given the group tokens, that enables the model to learn fine-grained alignment between visual groups and text entities.
Open Vocabulary Semantic Segmentation
Open-Vocabulary Semantic Segmentation
+1
no code implementations • ICCV 2023 • Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, LiMin Wang, Yu Qiao
The prolific performances of Vision Transformers (ViTs) in image tasks have prompted research into adapting the image ViTs for video tasks.
1 code implementation • CVPR 2023 • Bo Huang, Mingyang Chen, Yi Wang, Junda Lu, Minhao Cheng, Wei Wang
Thus, recent studies concern about adversarial distillation (AD) that aims to inherit not only prediction accuracy but also adversarial robustness of a robust teacher model under the paradigm of robust optimization.
no code implementations • CVPR 2023 • Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Yi Wang, Zhangyang Wang
In this work, we study the challenging task of lifting a single image to a 3D object and, for the first time, demonstrate the ability to generate a plausible 3D object with 360deg views that corresponds well with the given reference image.
1 code implementation • CVPR 2023 • Yi Wang, Ruili Wang, Xin Fan, Tianzhu Wang, Xiangjian He
A multi-level hybrid loss is firstly designed to guide the network to learn pixel-level, region-level, and object-level features.
no code implementations • 26 Dec 2022 • Xinyi Wang, Jianteng Peng, Sufang Zhang, Bihui Chen, Yi Wang, Yandong Guo
Recent years witnessed the breakthrough of face recognition with deep convolutional neural networks.
2 code implementations • 6 Dec 2022 • Yi Wang, Kunchang Li, Yizhuo Li, Yinan He, Bingkun Huang, Zhiyu Zhao, Hongjie Zhang, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Yali Wang, LiMin Wang, Yu Qiao
Specifically, InternVideo efficiently explores masked video modeling and video-language contrastive learning as the pretraining objectives, and selectively coordinates video representations of these two complementary frameworks in a learnable manner to boost various video applications.
Ranked #1 on
Action Recognition
on Something-Something V1
(using extra training data)
1 code implementation • 1 Dec 2022 • Ruibin Yuan, Hanzhi Yin, Yi Wang, Yifan He, Yushi Ye, Lei Zhang, Zhizheng Wu
We apply this technique to the automatic speaker verification (ASV) task as a proof of concept.
1 code implementation • 29 Nov 2022 • Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Yi Wang, Zhangyang Wang
In this work, we study the challenging task of lifting a single image to a 3D object and, for the first time, demonstrate the ability to generate a plausible 3D object with 360{\deg} views that correspond well with the given reference image.
no code implementations • 26 Nov 2022 • Junlin Hou, Jilan Xu, Nan Zhang, Yi Wang, Yuejie Zhang, Xiaobo Zhang, Rui Feng
This paper presents our solution for the 2nd COVID-19 Competition, occurring in the framework of the AIMIA Workshop at the European Conference on Computer Vision (ECCV 2022).
no code implementations • 26 Nov 2022 • Chandrajit Bajaj, Omatharv Bharat Vaidya, Yi Wang
Task learning in neural networks typically requires finding a globally optimal minimizer to a loss function objective.
no code implementations • 19 Nov 2022 • Xinwei Xue, Gaoyu Wang, Long Ma, Qi Jia, Yi Wang
In this paper, we design an adjacent slice feature fusion model to introduce information from adjacent slices.
3 code implementations • 17 Nov 2022 • Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, LiMin Wang, Yu Qiao
UniFormer has successfully alleviated this issue, by unifying convolution and self-attention as a relation aggregator in the transformer format.
2 code implementations • 17 Nov 2022 • Guo Chen, Sen Xing, Zhe Chen, Yi Wang, Kunchang Li, Yizhuo Li, Yi Liu, Jiahao Wang, Yin-Dong Zheng, Bingkun Huang, Zhiyu Zhao, Junting Pan, Yifei HUANG, Zun Wang, Jiashuo Yu, Yinan He, Hongjie Zhang, Tong Lu, Yali Wang, LiMin Wang, Yu Qiao
In this report, we present our champion solutions to five tracks at Ego4D challenge.
Ranked #1 on
State Change Object Detection
on Ego4D
4 code implementations • 13 Nov 2022 • Yi Wang, Nassim Ait Ali Braham, Zhitong Xiong, Chenying Liu, Conrad M Albrecht, Xiao Xiang Zhu
Self-supervised pre-training bears potential to generate expressive representations without human annotation.
Ranked #1 on
Multi-Label Image Classification
on BigEarthNet (official test set)
(using extra training data)
1 code implementation • 1 Nov 2022 • Jinwei Zhang, Pascal Spincemaille, Hang Zhang, Thanh D. Nguyen, Chao Li, Jiahao Li, Ilhami Kovanlikaya, Mert R. Sabuncu, Yi Wang
In this paper, we present our new framework, called Learned Acquisition and Reconstruction Optimization (LARO), which aims to accelerate the multi-echo gradient echo (mGRE) pulse sequence for QSM.
1 code implementation • 29 Oct 2022 • Yi Wang, Jiajun Deng, Tianzi Wang, Bo Zheng, Shoukang Hu, Xunying Liu, Helen Meng
Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care and to delay further progression.
1 code implementation • 20 Oct 2022 • Yi Wang, Menghan Xia, Lu Qi, Jing Shao, Yu Qiao
Multimodal ambiguity and color bleeding remain challenging in colorization.
1 code implementation • 20 Oct 2022 • Zefan Yang, Di Lin, Dong Ni, Yi Wang
To address these issues, we propose a non-iterative method where a stream of varying (pacing) pseudo-masks teach a network via consistency training, named PacingPseudo.
no code implementations • 10 Oct 2022 • Zhitong Xiong, Fahong Zhang, Yi Wang, Yilei Shi, Xiao Xiang Zhu
Furthermore, a new platform for EO, termed EarthNets, is released to achieve a fair and consistent evaluation of deep learning methods on remote sensing data.
2 code implementations • 15 Sep 2022 • Yi Wang, Zhiwen Fan, Tianlong Chen, Hehe Fan, Zhangyang Wang
Vision Transformers (ViTs) have proven to be effective, in solving 2D image understanding tasks by training over large-scale image datasets; and meanwhile as a somehow separate track, in modeling the 3D visual world too such as voxels or point clouds.
no code implementations • 8 Sep 2022 • Zeyu Liu, Yi Wang, Jing Wen, Yong Zhang, Hao Yin, Chao Guo, Zhongyu Wang
In addition, in order to improve the segmentation performance, we adopt multi-view and multi-window level method, at the same time we employ a fine-tune strategy to mitigate the impact of inconsistent labeling.
no code implementations • 2 Sep 2022 • Pengcheng Ai, Zhi Deng, Yi Wang, Hui Gong, Xinchi Ran, Zijian Lang
Recent literature reveals that deep learning models, especially one-dimensional convolutional neural networks, are promising when dealing with digital signals from nuclear detectors.
no code implementations • 23 Aug 2022 • Chunlei Cai, Yi Wang, Xiaobo Li, Tianxiao Ye
With the help of first pass predicted RF and corresponding actual quality as feedback, the second pass prediction will be highly accurate.
no code implementations • 28 Jun 2022 • Yi Wang, Tianzi Wang, Zi Ye, Lingwei Meng, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng
Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care and delay progression.
3 code implementations • 27 Jun 2022 • Yi Wang, Conrad M Albrecht, Nassim Ait Ali Braham, Lichao Mou, Xiao Xiang Zhu
In deep learning research, self-supervised learning (SSL) has received great attention triggering interest within both the computer vision and remote sensing communities.
Ranked #3 on
Multi-Label Image Classification
on BigEarthNet
1 code implementation • 23 Jun 2022 • Dong An, Zun Wang, Yangguang Li, Yi Wang, Yicong Hong, Yan Huang, Liang Wang, Jing Shao
Our model consists of three modules: the candidate waypoints predictor (CWP), the history enhanced planner and the tryout controller.
no code implementations • 23 Jun 2022 • Tianzi Wang, Jiajun Deng, Mengzhe Geng, Zi Ye, Shoukang Hu, Yi Wang, Mingyu Cui, Zengrui Jin, Xunying Liu, Helen Meng
Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care to delay further progression.
no code implementations • 20 Jun 2022 • Yi Wang, Yi Si
Recently, GAN-based neural vocoders such as Parallel WaveGAN, MelGAN, HiFiGAN, and UnivNet have become popular due to their lightweight and parallel structure, resulting in a real-time synthesized waveform with high fidelity, even on a CPU.
no code implementations • 14 Jun 2022 • Conrad M Albrecht, Chenying Liu, Yi Wang, Levente Klein, Xiao Xiang Zhu
We present and evaluate a weakly-supervised methodology to quantify the spatio-temporal distribution of urban forests based on remotely sensed data with close-to-zero human interaction.
no code implementations • 24 May 2022 • Jialiang Wang, Haotian Wei, Yi Wang, Shu Yang, Chi Li
Human activity recognition (HAR) based on multimodal sensors has become a rapidly growing branch of biometric recognition and artificial intelligence.
no code implementations • 20 May 2022 • Yi Wang, Zhiqing Wei, Zhiyong Feng
This article provides an overview of the beam training and tracking technologies on mmWave bands and reveals the insights for future research in the 6th Generation (6G) mobile network.
no code implementations • 25 Apr 2022 • Akos Lada, Xiaoxuan Liu, Jens Rischbieth, Yi Wang, Yuwen Zhang
Content recommender systems are generally adept at maximizing immediate user satisfaction but to optimize for the \textit{long-run} user value, we need more statistically sophisticated solutions than off-the-shelf simple recommender algorithms.
2 code implementations • 11 Apr 2022 • Yi Wang, Conrad M Albrecht, Xiao Xiang Zhu
Experimental results employing the BigEarthNet-MM dataset demonstrate the benefits of both, the ViT backbones and the proposed multimodal SSL algorithm DINO-MM.
no code implementations • 1 Apr 2022 • Miha Grabner, Yi Wang, Qingsong Wen, Boštjan Blažič, Vitomir Štruc
Efficient load forecasting is needed to ensure better observability in the distribution networks, whereas such forecasting is made possible by an increasing number of smart meter installations.
1 code implementation • CVPR 2022 • Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, Jiaya Jia
Recent studies have shown the importance of modeling long-range interactions in the inpainting problem.
Ranked #2 on
Image Inpainting
on CelebA-HQ