no code implementations • 2 Dec 2024 • Hao Li, Xiang Chen, Jiangxin Dong, Jinhui Tang, Jinshan Pan
Despite the significant progress made by all-in-one models in universal image restoration, existing methods suffer from a generalization bottleneck in real-world scenarios, as they are mostly trained on small-scale synthetic datasets with limited degradations.
no code implementations • 29 Nov 2024 • Xuhao Hu, Dongrui Liu, Hao Li, Xuanjing Huang, Jing Shao
To explain such a counter-intuitive phenomenon, we discover a visual safety information leakage (VSIL) problem in existing multimodal safety benchmarks, i. e., the potentially risky and sensitive content in the image has been revealed in the textual query.
no code implementations • 28 Nov 2024 • Peilin Tian, Hao Li
Specifically, we propose a solution of visual SLAMMOT considering multiple motion models and validate the inherent advantages of IMM-SLAMMOT in the visual domain.
no code implementations • 19 Nov 2024 • Hao Li, Yuanyuan Gao, Haosong Peng, Chenming Wu, Weicai Ye, Yufeng Zhan, Chen Zhao, Dingwen Zhang, Jingdong Wang, Junwei Han
This paper presents DGTR, a novel distributed framework for efficient Gaussian reconstruction for sparse-view vast scenes.
2 code implementations • 15 Nov 2024 • Guowei Xu, Peng Jin, Hao Li, Yibing Song, Lichao Sun, Li Yuan
Large language models have demonstrated substantial advancements in reasoning capabilities, particularly through inference-time scaling, as illustrated by models such as OpenAI's o1.
no code implementations • 13 Nov 2024 • Qiang Zhou, Shaofeng Zhang, Nianzu Yang, Ye Qian, Hao Li
Furthermore, MVideo supports motion condition editing and composition, facilitating the generation of videos with more complex actions.
no code implementations • 7 Nov 2024 • Chengxin Hu, Hao Li
We then explore the effect of various feature levels on performance, finding that both the quality of LLM-generated molecules and performance on different tasks benefit from different feature levels.
1 code implementation • 30 Oct 2024 • Hao Li, Xiaogeng Liu
NotInject contains 339 benign samples enriched with trigger words common in prompt injection attacks, enabling fine-grained evaluation.
no code implementations • 28 Oct 2024 • Jiacheng Wang, Xiang Chen, Renjiu Hu, Rongguang Wang, Min Liu, Yaonan Wang, Jiazheng Wang, Hao Li, Hang Zhang
Co-examination of second-harmonic generation (SHG) and bright-field (BF) microscopy enables the differentiation of tissue components and collagen fibers, aiding the analysis of human breast and pancreatic cancer tissues.
no code implementations • 26 Oct 2024 • Yue Su, Hao Li, Maoguo Gong
This black-box, self-supervised approach ensures the generalizability of our attack against various VI-ReID models.
1 code implementation • 24 Oct 2024 • Hawau Olamide Toyin, Hao Li, Hanan Aldarmaki
Speech recognition and speech synthesis models are typically trained separately, each with its own set of learning objectives, training data, and model parameters, resulting in two distinct large networks.
no code implementations • 19 Oct 2024 • Yuzhu Li, Hao Li, WeiJie Chen, Keelan O'Riordan, Neha Mani, Yuxuan Qi, Tairan Liu, Sridhar Mani, Aydogan Ozcan
It blindly achieved a sensitivity of 97. 92% and a specificity of 96. 77% for DB10, and a sensitivity of 100% and a specificity of 97. 22% for H6.
1 code implementation • 17 Oct 2024 • Rongyao Fang, Chengqi Duan, Kun Wang, Hao Li, Hao Tian, Xingyu Zeng, Rui Zhao, Jifeng Dai, Hongsheng Li, Xihui Liu
This work represents a significant step towards a truly unified MLLM capable of adapting to the granularity demands of various visual tasks.
no code implementations • 17 Oct 2024 • Hao Li, Chengyi Xing, Saad Khan, Miaoya Zhong, Mark R. Cutkosky
Aquatic mammals, such as pinnipeds, utilize their whiskers to detect and discriminate objects and analyze water movements, inspiring the development of robotic whiskers for sensing contacts, surfaces, and water flows.
1 code implementation • 14 Oct 2024 • Qibing Ren, Hao Li, Dongrui Liu, Zhanxu Xie, Xiaoya Lu, Yu Qiao, Lei Sha, Junchi Yan, Lizhuang Ma, Jing Shao
This study exposes the safety vulnerabilities of Large Language Models (LLMs) in multi-turn interactions, where malicious users can obscure harmful intents across several queries.
1 code implementation • 11 Oct 2024 • Hao Li, Cor-Paul Bezemer, Ahmed E. Hassan
Our study not only enriches the body of knowledge on practical applications of FM4SE and SE4FM but also demonstrates the utility of FMs as a powerful and efficient approach in conducting literature surveys within technical and grey literature domains.
no code implementations • 11 Oct 2024 • De-Xing Huang, Xiao-Hu Zhou, Mei-Jiang Gui, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Hao Li, Tian-Yu Xiang, Zeng-Guang Hou
Iodinated contrast agents are widely utilized in numerous interventional procedures, yet posing substantial health risks to patients.
no code implementations • 10 Oct 2024 • Qian Liu, Bing Gong, Xiaoran Zhuang, Xiaohui Zhong, Zhiming Kang, Hao Li
The rapid advancement of artificial intelligence (AI) in weather research has been driven by the ability to learn from large, high-dimensional datasets.
no code implementations • 8 Oct 2024 • Yunfei Yang, Zhenghao Qi, Honghuan Wu, Qi Song, Tieyao Zhang, Hao Li, Yimin Tu, Kaiqiao Zhan, Ben Wang
Specifically, we utilize a gate model to identify videos that may have playback issues in real-time, and then we employ a ranking model to select the optimal result from a locally-cached pool to replace the stuttering videos.
no code implementations • 7 Oct 2024 • Yongming Chen, Miner Chen, Ye Zhu, Juan Pei, Siyu Chen, Yu Zhou, Yi Wang, Yifan Zhou, Hao Li, Songan Zhang
To address this issue, we propose an efficient law article recommendation approach utilizing a Knowledge Graph (KG) and a Large Language Model (LLM).
1 code implementation • 6 Oct 2024 • Dewei Hu, Hao Li, Han Liu, Jiacheng Wang, Xing Yao, Daiwei Lu, Ipek Oguz
Subsequently, we sample on the target domain with binary vessel masks from the source domain to get paired data, i. e., target domain synthetic images conditioned on the binary vessel map.
1 code implementation • 2 Oct 2024 • Hao Li, Jiayang Gu, Jingkuan Song, An Zhang, Lianli Gao
Mitigating the detrimental effects of noisy labels on the training process has become increasingly critical, as obtaining entirely clean or human-annotated samples for large-scale pre-training tasks is often impractical.
no code implementations • 30 Sep 2024 • Yabing Wang, Le Wang, Qiang Zhou, Zhibin Wang, Hao Li, Gang Hua, Wei Tang
Then, we take these semantic slots as internal features and leverage them to interact with the visual features.
no code implementations • 28 Sep 2024 • Ziyu Wang, Hao Li, Di Huang, Amir M. Rahmani
In digital healthcare, large language models (LLMs) have primarily been utilized to enhance question-answering capabilities and improve patient interactions.
1 code implementation • 26 Sep 2024 • Jiaming Liu, Linghe Kong, Yue Wu, Maoguo Gong, Hao Li, Qiguang Miao, Wenping Ma, Can Qin
Existing 3D mask learning methods encounter performance bottlenecks under limited data, and our objective is to overcome this limitation.
no code implementations • 23 Sep 2024 • Donglin Di, He Feng, Wenzhang Sun, Yongjia Ma, Hao Li, Wei Chen, Xiaofei Gou, Tonghua Su, Xun Yang
We obtain the corresponding performance benchmarks and compared them with those trained on public datasets to demonstrate the superiority of our dataset.
no code implementations • 18 Sep 2024 • Yuping Wu, Hao Li, Hongbo Zhu, Goran Nenadic, Xiao-jun Zeng
In this paper, we first introduce a parameter-free highlight method into the encoder-decoder framework: replacing the encoder attention mask with a saliency mask in the cross-attention module to force the decoder to focus only on salient parts of the input.
1 code implementation • 11 Sep 2024 • Xiaohui Zhong, Lei Chen, Xu Fan, Wenxu Qian, Jun Liu, Hao Li
The results demonstrate that FuXi-2. 0 consistently outperforms ECMWF HRES in forecasting key meteorological variables relevant to these sectors.
no code implementations • 10 Sep 2024 • Hao Li, Dong Liang, Zheng Xie
Meta-learning is characterized by its ability to learn how to learn, enabling the adaptation of learning strategies across different tasks.
no code implementations • 2 Sep 2024 • Ke Chang, Hao Li, Junzhao Zhang, Yunfang Wu
Metaphor and sarcasm are common figurative expressions in people's communication, especially on the Internet or the memes popular among teenagers.
no code implementations • 29 Aug 2024 • Xiangchen Yin, Donglin Di, Lei Fan, Hao Li, Chen Wei, Xiaofei Gou, Yang song, Xiao Sun, Xun Yang
Recent methods using diffusion models have made significant progress in human image generation with various additional controls such as pose priors.
no code implementations • 26 Aug 2024 • Hao Li, Jiangxin Dong, Jinshan Pan
However, the key components in recurrent-based VSR networks significantly impact model efficiency, e. g., the alignment module occupies a substantial portion of model parameters, while the bidirectional propagation mechanism significantly amplifies the inference time.
no code implementations • 22 Aug 2024 • Aishik Nagar, Viktor Schlegel, Thanh-Tung Nguyen, Hao Li, Yuping Wu, Kuluhan Binici, Stefan Winkler
Large Language Models (LLMs) are increasingly adopted for applications in healthcare, reaching the performance of domain experts on tasks such as question answering and document summarisation.
1 code implementation • 17 Aug 2024 • Hao Li, Hao Jiang, Jiajun Fan, Dongsheng Ye, Liang Du
This paper introduces the Dynamic Neural Dowker Network (DNDN), a novel framework specifically designed to approximate the results of dynamic Dowker filtration, aiming to capture the high-order topological features of dynamic directed graphs.
1 code implementation • 13 Aug 2024 • Hao Li, Fabian Deuser, Wenping Yina, Xuanshu Luo, Paul Walther, Gengchen Mai, Wei Huang, Martin Werner
The second information is geolocation awareness, which means how people whereabouts are made available.
no code implementations • 10 Aug 2024 • Ziyi Gao, Kai Chen, Zhipeng Wei, Tingshu Mou, Jingjing Chen, Zhiyu Tan, Hao Li, Yu-Gang Jiang
However, existing works on diffusion-based unrestricted attacks are mostly focused on images yet are seldom explored in videos.
1 code implementation • 10 Aug 2024 • Xiuyu Sun, Xiaohui Zhong, Xiaoze Xu, Yuanqing Huang, Hao Li, J. David Neelin, Deliang Chen, Jie Feng, Wei Han, Libo Wu, Yuan Qi
Weather forecasting traditionally relies on numerical weather prediction (NWP) systems that integrates global observational systems, data assimilation (DA), and forecasting models.
2 code implementations • 9 Aug 2024 • Hao Li, Baris Oguz, Gabriel Arenas, Xing Yao, Jiacheng Wang, Alison Pouch, Brett Byram, Nadav Schwartz, Ipek Oguz
The proposed model adopts the segmentation from our fully automated model for initialization and is designed in a human-in-the-loop manner to achieve iterative improvements.
no code implementations • 8 Aug 2024 • Hao Li, Han Liu, Heinrich von Busch, Robert Grimm, Henkjan Huisman, Angela Tong, David Winkel, Tobias Penzkofer, Ivan Shabunin, Moon Hyung Choi, Qingsong Yang, Dieter Szolar, Steven Shea, Fergus Coakley, Mukesh Harisinghani, Ipek Oguz, Dorin Comaniciu, Ali Kamen, Bin Lou
This method translates diffusion-weighted imaging (DWI) acquisitions, including apparent diffusion coefficient (ADC) and individual DW images acquired using various b-values, to align with the style of images acquired using b-values recommended by Prostate Imaging Reporting and Data System (PI-RADS) guidelines.
no code implementations • 5 Aug 2024 • Zhiyu Tan, Xiaomeng Yang, Luozheng Qin, Hao Li
Produced through a coarse-to-fine curation strategy, this dataset guarantees high-quality videos and detailed captions with excellent temporal consistency.
no code implementations • 4 Aug 2024 • Zheng Li, Siyuan Wu, Ruichuan Chen, Paarijaat Aditya, Istemi Ekin Akkus, Manohar Vanga, Min Zhang, Hao Li, Yang Zhang
Machine learning (ML), driven by prominent paradigms such as centralized and federated learning, has made significant progress in various critical applications ranging from autonomous driving to face recognition.
no code implementations • 25 Jul 2024 • Qing Zhao, Chengkui Zhang, Hao Li, Ting Ke
So, traditionally, it has to rely on a low w value in the later stage to force these algorithms to converge, but also makes them quickly lose their search ability and prone to getting trapped in local optima.
no code implementations • 25 Jul 2024 • Jiacheng Wang, Hao Li, Dewei Hu, Rui Xu, Xing Yao, Yuankai K. Tao, Ipek Oguz
We propose a novel framework for retinal feature point alignment, designed for learning cross-modality features to enhance matching and registration across multi-modality retinal images.
1 code implementation • 21 Jul 2024 • Hao Li, Zheng Li, Siyuan Wu, Chengrui Hu, Yutong Ye, Min Zhang, Dengguo Feng, Yang Zhang
Building upon this signal, we introduce a novel attack method called Sequential-metric based Membership Inference Attack (SeqMIA).
no code implementations • 15 Jul 2024 • Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Runyi Yu, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen
Specifically, we provide an automated method for reference local action sampling and leverage graph attention networks to assess the guiding weight of each local action in the overall motion synthesis.
1 code implementation • 11 Jul 2024 • Wanling Gao, Yunyou Huang, Dandan Cui, Zhuoming Yu, Wenjing Liu, Xiaoshuang Liang, Jiahui Zhao, Jiyue Xie, Hao Li, Li Ma, Ning Ye, Yumiao Kang, Dingfeng Luo, Peng Pan, Wei Huang, Zhongmou Liu, Jizhong Hu, Gangyuan Zhao, Chongrong Jiang, Fan Huang, Tianyi Wei, Suqin Tang, Bingjie Xia, Zhifei Zhang, Jianfeng Zhan
We envision DC-AI RCTs and VC-MedAI as pivotal advancements, presenting innovative and transformative evaluation methodologies for AI models in clinical practice, offering a preclinical-like setting mirroring conventional medicine, and reshaping development paradigms in a cost-effective and fast-iterative manner.
2 code implementations • 10 Jul 2024 • Hao Li, Baris Oguz, Gabriel Arenas, Xing Yao, Jiacheng Wang, Alison Pouch, Brett Byram, Nadav Schwartz, Ipek Oguz
These models produce a segmentation from visual prompts provided to indicate the target region, which may offer a feasible solution for practical use.
1 code implementation • 9 Jul 2024 • Hao Li, Shonosuke Sugasawa, Shota Katayama
While K-means is known to be a standard clustering algorithm, its performance may be compromised due to the presence of outliers and high-dimensional noisy variables.
1 code implementation • 9 Jul 2024 • Jiankun Li, Hao Li, JiangJiang Liu, Zhikang Zou, Xiaoqing Ye, Fan Wang, Jizhou Huang, Hua Wu, Haifeng Wang
Deep learning-based models are widely deployed in autonomous driving areas, especially the increasingly noticed end-to-end solutions.
1 code implementation • 7 Jul 2024 • Hao Li, Gopi Krishnan Rajbahadur, Cor-Paul Bezemer
Our study is the first to show that using a non-default binding can help improve machine learning software quality from the time cost perspective compared to the default Python binding while still achieving the same level of correctness.
1 code implementation • 30 Jun 2024 • Ruining Deng, Quan Liu, Can Cui, Tianyuan Yao, Juming Xiong, Shunxing Bao, Hao Li, Mengmeng Yin, Yu Wang, Shilin Zhao, Yucheng Tang, Haichun Yang, Yuankai Huo
Panoramic image segmentation in computational pathology presents a remarkable challenge due to the morphologically complex and variably scaled anatomy.
1 code implementation • 28 Jun 2024 • De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Bo-Xian Yao, Zeng-Guang Hou
Automatic vessel segmentation is paramount for developing next-generation interventional navigation systems.
1 code implementation • 27 Jun 2024 • Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Yajing Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, Jing Qin, Xiahai Zhuang, Claudia Prieto, Alistair Young, Michael Markl, He Wang, Lianming Wu, Guang Yang, Xiaobo Qu, Chengyan Wang
To the best of our knowledge, the CMRxRecon2024 dataset is the largest and most diverse publicly available cardiac k-space dataset.
no code implementations • 26 Jun 2024 • Hao Li, Ming Yuan, Yan Zhang, Chenming Wu, Chen Zhao, Chunyu Song, Haocheng Feng, Errui Ding, Dingwen Zhang, Jingdong Wang
To address this, this paper presents a novel driving view synthesis dataset and benchmark specifically designed for autonomous driving simulations.
no code implementations • 26 Jun 2024 • Hao Li, Jingfeng Li, Dingwen Zhang, Chenming Wu, Jieqi Shi, Chen Zhao, Haocheng Feng, Errui Ding, Jingdong Wang, Junwei Han
Dynamic Gaussian splatting has led to impressive scene reconstruction and image synthesis advances in novel views.
1 code implementation • 24 Jun 2024 • Zhiyu Tan, Xiaomeng Yang, Luozheng Qin, Mengping Yang, Cheng Zhang, Hao Li
Our evaluation across 24 text-to-image generation models demonstrate that EvalAlign not only provides superior metric stability but also aligns more closely with human preferences than existing metrics, confirming its effectiveness and utility in model assessment.
1 code implementation • 22 Jun 2024 • Han Liu, Hao Li, Jiacheng Wang, Yubo Fan, Zhoubing Xu, Ipek Oguz
Recently, in silico labeling has emerged as a promising alternative, aiming to use machine learning models to directly predict the fluorescently labeled images from label-free microscopy.
no code implementations • 19 Jun 2024 • Yuhan Zhu, Jian Wang, Bing Li, Xuxian Tang, Hao Li, Neng Zhang, Yuqi Zhao
Experiments conducted on the dataset collected from the benchmark show that MicroCERCL can accurately localize the root cause of microservice systems in such environments, significantly outperforming state-of-the-art approaches with an increase of at least 24. 1% in top-1 accuracy.
1 code implementation • 18 Jun 2024 • Haoyu Li, Baochen Yang, Yu Xi, Linfeng Yu, Tian Tan, Hao Li, Kai Yu
TPDT-SS shows remarkable success in addressing permutation problems in mixed keyword speech, thereby greatly boosting the performance of the backend.
no code implementations • 15 Jun 2024 • Ying Fu, Yu Li, ShaoDi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu, Yunkang Zhang, Siyuan Jiang, Xiaoqiang Lu, Licheng Jiao, Fang Liu, Xu Liu, Lingling Li, Wenping Ma, Shuyuan Yang, Haiyang Xie, Jian Zhao, Shihua Huang, Peng Cheng, Xi Shen, Zheng Wang, Shuai An, Caizhi Zhu, Xuelong Li, Tao Zhang, Liang Li, Yu Liu, Chenggang Yan, Gengchen Zhang, Linyan Jiang, Bingyi Song, Zhuoyu An, Haibo Lei, Qing Luo, Jie Song, YuAn Liu, Haoyuan Zhang, Lingfeng Wang, Wei Chen, Aling Luo, Cheng Li, Jun Cao, Shu Chen, Zifei Dou, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Xuejian Gou, Qinliang Wang, Yang Liu, Shizhan Zhao, Yanzhao Zhang, Libo Yan, Yuwei Guo, Guoxin Li, Qiong Gao, Chenyue Che, Long Sun, Xiang Chen, Hao Li, Jinshan Pan, Chuanlong Xie, Hongming Chen, Mingrui Li, Tianchen Deng, Jingwei Huang, Yufeng Li, Fei Wan, Bingxin Xu, Jian Cheng, Hongzhe Liu, Cheng Xu, Yuxiang Zou, Weiguo Pan, Songyin Dai, Sen Jia, Junpei Zhang, Puhua Chen, Qihang Li
The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies.
1 code implementation • 9 Jun 2024 • Hao Li, Chenghao Yang, An Zhang, Yang Deng, Xiang Wang, Tat-Seng Chua
Crucial to addressing this real-world need are event summary and persona management, which enable reasoning for appropriate long-term dialogue responses.
1 code implementation • 6 Jun 2024 • Xizhou Zhu, Xue Yang, Zhaokai Wang, Hao Li, Wenhan Dou, Junqi Ge, Lewei Lu, Yu Qiao, Jifeng Dai
Our core idea is to use models with different parameter sizes to process different resolution levels of the image pyramid, thereby balancing computational efficiency and performance.
2 code implementations • 5 Jun 2024 • Hao Li, Yuping Wu, Viktor Schlegel, Riza Batista-Navarro, Tharindu Madusanka, Iqra Zahid, Jiayan Zeng, Xiaochi Wang, Xinran He, Yizhi Li, Goran Nenadic
In our work, we introduce an argument mining dataset that captures the end-to-end process of preparing an argumentative essay for a debate, which covers the tasks of claim and evidence identification (Task 1 ED), evidence convincingness ranking (Task 2 ECR), argumentative essay summarisation and human preference ranking (Task 3 ASR) and metric learning for automated evaluation of resulting essays, based on human feedback along argument quality dimensions (Task 4 SQE).
no code implementations • 1 Jun 2024 • Jiawei Cao, Chongtao Guo, Hao Li, Zhigang Wang, Houjun Wang, Geoffrey Ye Li
In this paper, we propose a deep learning based performance testing framework to minimize the number of required test modules while guaranteeing the accuracy requirement, where a test module corresponds to a combination of one circuit and one stimulus.
no code implementations • 25 May 2024 • Phong Tran, Egor Zakharov, Long-Nhat Ho, Liwen Hu, Adilbek Karmanov, Aviral Agarwal, Mclean Goldwhite, Ariana Bermudez Venegas, Anh Tuan Tran, Hao Li
We further integrate our novel head reenactment solution into an accessible high-fidelity VR telepresence system, where any person can instantly build a personalized neural head avatar from any photo and bring it to life using the headset.
1 code implementation • 22 May 2024 • Dingwen Zhang, Hao Li, Diqi He, Nian Liu, Lechao Cheng, Jingdong Wang, Junwei Han
Experimental evaluations conducted on MS COCO, Cityscapes, and CTW1500 datasets indicate that the QEIS models' performance can be significantly improved when pre-trained with our method.
1 code implementation • 21 May 2024 • Zhiyu Tan, Mengping Yang, Luozheng Qin, Hao Yang, Ye Qian, Qiang Zhou, Cheng Zhang, Hao Li
Moreover, the model capacity of the text encoder from CLIP is relatively limited compared to Large Language Models (LLMs), which offer multilingual input, accommodate longer context, and achieve superior text representation.
no code implementations • 14 May 2024 • Hao Li, Lipo Wang, Tianyun Zhao, Wei Zhao
Image stitching aims to construct a wide field of view with high spatial resolution, which cannot be achieved in a single exposure.
no code implementations • 10 May 2024 • Mengjia Niu, Hao Li, Jie Shi, Hamed Haddadi, Fan Mo
Large language models (LLMs) have demonstrated remarkable capabilities across various domains, although their susceptibility to hallucination poses significant challenges for their deployment in critical areas such as healthcare.
no code implementations • 9 May 2024 • Xiaohui Zhong, Lei Chen, Hao Li, Jun Liu, Xu Fan, Jie Feng, Kan Dai, Jing-Jia Luo, Jie Wu, Bo Lu
This innovative approach makes FuXi-ENS an advancement over the traditional ones that use L1 loss combined with the KL loss in standard VAE models for ensemble weather forecasting.
2 code implementations • 23 Apr 2024 • Hao Li, Han Liu, Dewei Hu, Jiacheng Wang, Ipek Oguz
(3) Corrective learning.
1 code implementation • 23 Apr 2024 • Muhammad Ahmad, Salvatore Distifano, Adil Mehmood Khan, Manuel Mazzara, Chenyu Li, Hao Li, Jagannath Aryal, Yao Ding, Gemine Vivone, Danfeng Hong
Hyperspectral Image Classification (HSC) presents significant challenges owing to the high dimensionality and intricate nature of Hyperspectral (HS) data.
1 code implementation • 21 Apr 2024 • Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Hao Li, Ming Tang, Jinqiao Wang
Zero-shot anomaly detection (ZSAD) methods entail detecting anomalies directly without access to any known normal or abnormal samples within the target item categories.
no code implementations • 14 Apr 2024 • Haosong Peng, Wei Feng, Hao Li, Yufeng Zhan, Ren Jin, Yuanqing Xia
Through extensive experiments, our findings reveal that Arena can boost inference speeds by up to 1. 58\(\times\) and 1. 82\(\times\) on average while consuming only 47\% and 31\% of the bandwidth, respectively, all with high inference accuracy.
1 code implementation • 12 Apr 2024 • Xiaoze Xu, Xiuyu Sun, Wei Han, Xiaohui Zhong, Lei Chen, Hao Li
Data assimilation (DA), as an indispensable component within contemporary Numerical Weather Prediction (NWP) systems, plays a crucial role in generating the analysis that significantly impacts forecast performance.
no code implementations • 8 Apr 2024 • Jingxin Wang, Renxiang Guan, Kainan Gao, Zihao Li, Hao Li, Xianju Li, Chang Tang
Multi-level graph subspace contrastive learning: multi-level contrastive learning was conducted to obtain local-global joint graph representations, to improve the consistency of the positive samples between views, and to obtain more robust graph embeddings.
1 code implementation • 8 Apr 2024 • Shen Gao, Hao Li, Chengrui Huang, Quan Tu, Zhiliang Tian, Minlie Huang, Shuo Shang
The framework employs a novel 360$^\circ$ performance assessment method for multi-perspective performance evaluation with fine-grained assessment.
1 code implementation • 6 Apr 2024 • Hao Li, Xiang Chen, Jiangxin Dong, Jinhui Tang, Jinshan Pan
However, inaccurate alignment usually leads to aligned features with significant artifacts, which will be accumulated during propagation and thus affect video restoration.
Ranked #5 on Video Super-Resolution on Vid4 - 4x upscaling
no code implementations • CVPR 2024 • Hao Li, Yang Zou, Ying Wang, Orchid Majumder, Yusheng Xie, R. Manmatha, Ashwin Swaminathan, Zhuowen Tu, Stefano Ermon, Stefano Soatto
On the data scaling side, we show the quality and diversity of the training set matters more than simply dataset size.
no code implementations • CVPR 2024 • Sicheng Li, Hao Li, Yiyi Liao, Lu Yu
The emergence of Neural Radiance Fields (NeRF) has greatly impacted 3D scene modeling and novel-view synthesis.
1 code implementation • 1 Apr 2024 • Jun Lyu, Chen Qin, Shuo Wang, Fanwen Wang, Yan Li, Zi Wang, Kunyuan Guo, Cheng Ouyang, Michael Tänzer, Meng Liu, Longyu Sun, Mengting Sun, Qin Li, Zhang Shi, Sha Hua, Hao Li, Zhensen Chen, Zhenlin Zhang, Bingyu Xin, Dimitris N. Metaxas, George Yiasemis, Jonas Teuwen, Liping Zhang, Weitian Chen, Yidong Zhao, Qian Tao, Yanwei Pang, Xiaohan Liu, Artem Razumov, Dmitry V. Dylov, Quan Dou, Kang Yan, Yuyang Xue, Yuning Du, Julia Dietlmeier, Carles Garcia-Cabrera, Ziad Al-Haj Hemidi, Nora Vogt, Ziqiang Xu, Yajing Zhang, Ying-Hua Chu, Weibo Chen, Wenjia Bai, Xiahai Zhuang, Jing Qin, Lianmin Wu, Guang Yang, Xiaobo Qu, He Wang, Chengyan Wang
To address this issue, we organized the Cardiac MRI Reconstruction Challenge (CMRxRecon) in 2023, in collaboration with the 26th International Conference on MICCAI.
no code implementations • 25 Mar 2024 • Yang Luo, Xiqing Guo, Hao Li
Due to the complementary nature of visible light and thermal infrared modalities, object tracking based on the fusion of visible light images and thermal images (referred to as RGB-T tracking) has received increasing attention from researchers in recent years.
Ranked #5 on Rgb-T Tracking on GTOT
no code implementations • 20 Mar 2024 • Yu Xi, Hao Li, Baochen Yang, Haoyu Li, Hainan Xu, Kai Yu
Designing an efficient keyword spotting (KWS) system that delivers exceptional performance on resource-constrained edge devices has long been a subject of significant attention.
no code implementations • 15 Mar 2024 • Hao Li, Yuanyuan Gao, Chenming Wu, Dingwen Zhang, Yalun Dai, Chen Zhao, Haocheng Feng, Errui Ding, Jingdong Wang, Junwei Han
Specifically, we design a novel joint learning framework that consists of an Iterative Pose Optimization Network (IPO-Net) and a Generalizable 3D-Gaussians (G-3DG) model.
1 code implementation • 14 Mar 2024 • Xiangrui Cai, Yang Wang, Sihan Xu, Hao Li, Ying Zhang, Zheli Liu, Xiaojie Yuan
Moreover, LAN can be also applied to post-hoc ITD, surpassing 8 competitive baselines by at least 7. 70% and 4. 03% in AUC on two datasets.
no code implementations • CVPR 2024 • QiHao Zhao, Yalun Dai, Hao Li, Wei Hu, Fan Zhang, Jun Liu
Long-tail recognition is challenging because it requires the model to learn good representations from tail categories and address imbalances across all categories.
no code implementations • CVPR 2024 • Junyan Wang, Zhenhong Sun, Zhiyu Tan, Xuanbai Chen, Weihua Chen, Hao Li, Cheng Zhang, Yang song
Vanilla text-to-image diffusion models struggle with generating accurate human images, commonly resulting in imperfect anatomies such as unnatural postures or disproportionate limbs. Existing methods address this issue mostly by fine-tuning the model with extra images or adding additional controls -- human-centric priors such as pose or depth maps -- during the image generation phase.
1 code implementation • CVPR 2024 • Hao Li, Ying Chen, Yifei Chen, Wenxian Yang, Bowen Ding, Yuchen Han, Liansheng Wang, Rongshan Yu
It is designed to enhance the model's generalizability by leveraging the interaction between localized visual patterns and fine-grained pathological semantics.
1 code implementation • 26 Feb 2024 • Zhaopeng Feng, Yan Zhang, Hao Li, Bei Wu, Jiayu Liao, Wenqiang Liu, Jun Lang, Yang Feng, Jian Wu, Zuozhu Liu
Large Language Models (LLMs) have achieved impressive results in Machine Translation (MT).
no code implementations • 25 Feb 2024 • Hao Wang, Hao Li, Minlie Huang, Lei Sha
In addition, our approach can be generalized into a broader method for generating transferable adversarial suffixes that can successfully attack multiple LLMs, even black-box LLMs, such as ChatGPT and Gemini.
no code implementations • 22 Feb 2024 • Hao Li, Mengqi Huang, Lei Zhang, Bo Hu, Yi Liu, Zhendong Mao
GAN-based image attribute editing firstly leverages GAN Inversion to project real images into the latent space of GAN and then manipulates corresponding latent codes.
no code implementations • 20 Feb 2024 • Yizhi Li, Ge Zhang, Xingwei Qu, Jiali Li, Zhaoqun Li, Zekun Wang, Hao Li, Ruibin Yuan, Yinghao Ma, Kai Zhang, Wangchunshu Zhou, Yiming Liang, Lei Zhang, Lei Ma, Jiajun Zhang, Zuowen Li, Stephen W. Huang, Chenghua Lin, Jie Fu
The advancement of large language models (LLMs) has enhanced the ability to generalize across a wide range of unseen natural language processing (NLP) tasks through instruction-following.
no code implementations • 2 Feb 2024 • Hao Li, Wei Wang, Cong Wang, Zhigang Luo, Xinwang Liu, Kenli Li, Xiaochun Cao
Single-domain generalized object detection aims to enhance a model's generalizability to multiple unseen target domains using only data from a single source domain during training.
no code implementations • 22 Jan 2024 • De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Xiu-Ling Liu, Zeng-Guang Hou
2. 5D-based segmentation models bridge computational efficiency of 2D-based models and spatial perception capabilities of 3D-based models.
1 code implementation • 18 Jan 2024 • Zesen Cheng, Kehan Li, Hao Li, Peng Jin, Chang Liu, Xiawu Zheng, Rongrong Ji, Jie Chen
To mold instance queries to follow Brownian bridge and accomplish alignment with class texts, we design Bridge-Text Alignment (BTA) to learn discriminative bridge-level representations of instances via contrastive objectives.
1 code implementation • 18 Jan 2024 • Hao Li, Gopi Krishnan Rajbahadur, Dayi Lin, Cor-Paul Bezemer, Zhen Ming, Jiang
This classifier is then used to detect if a trained model is overfit.
no code implementations • 15 Jan 2024 • Zhifeng Xie, Hao Li, Huiming Ding, Mengtian Li, Ying Cao
Cross-modal fashion synthesis and editing offer intelligent support to fashion designers by enabling the automatic generation and local modification of design drafts. While current diffusion models demonstrate commendable stability and controllability in image synthesis, they still face significant challenges in generating fashion design from abstract design elements and fine-grained editing. Abstract sensory expressions, \eg office, business, and party, form the high-level design concepts, while measurable aspects like sleeve length, collar type, and pant length are considered the low-level attributes of clothing. Controlling and editing fashion images using lengthy text descriptions poses a difficulty. In this paper, we propose HieraFashDiff, a novel fashion design method using the shared multi-stage diffusion model encompassing high-level design concepts and low-level clothing attributes in a hierarchical structure. Specifically, we categorized the input text into different levels and fed them in different time step to the diffusion model according to the criteria of professional clothing designers. HieraFashDiff allows designers to add low-level attributes after high-level prompts for interactive editing incrementally. In addition, we design a differentiable loss function in the sampling process with a mask to keep non-edit areas. Comprehensive experiments performed on our newly conducted Hierarchical fashion dataset, demonstrate that our proposed method outperforms other state-of-the-art competitors.
no code implementations • 12 Jan 2024 • Yu Xi, Baochen Yang, Hao Li, Jiaqi Guo, Kai Yu
Furthermore, experiments on the continuous speech dataset LibriSpeech demonstrate that, by incorporating audio discrimination, CLAD achieves significant performance gain over CL without audio discrimination.
no code implementations • 27 Dec 2023 • Di Chen, Hao Li, Zhicheng Jin, Huizhao Tu, Meixin Zhu
Autonomous vehicles (AVs) have the potential to prevent accidents caused by drivers errors and reduce road traffic risks.
no code implementations • 25 Dec 2023 • Hao Li, Hanqi Tao, Wentao Huang, Hongcai Zhang, Ran Li
Despite the success of electric vehicles on land, electrification of maritime ships is challenged by the dilemma of range anxiety and cargo-carrying capacity.
1 code implementation • 22 Dec 2023 • Qianrui Zhou, Hua Xu, Hao Li, Hanlei Zhang, Xiaohan Zhang, Yifan Wang, Kai Gao
To establish an optimal multimodal semantic environment for text modality, we develop a modality-aware prompting module (MAP), which effectively aligns and fuses features from text, video and audio modalities with similarity-based modality alignment and cross-modality attention mechanism.
Ranked #2 on Multimodal Intent Recognition on MIntRec
no code implementations • 15 Dec 2023 • Lei Chen, Xiaohui Zhong, Hao Li, Jie Wu, Bo Lu, Deliang Chen, Shangping Xie, Qingchen Chao, Chensen Lin, Zixin Hu, Yuan Qi
Skillful subseasonal forecasts are crucial for various sectors of society but pose a grand scientific challenge.
no code implementations • CVPR 2024 • Hao Li, Xue Yang, Zhaokai Wang, Xizhou Zhu, Jie zhou, Yu Qiao, Xiaogang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai
Many reinforcement learning environments (e. g., Minecraft) provide only sparse rewards that indicate task completion or failure with binary values.
1 code implementation • CVPR 2024 • Yuzhe Zhang, Jiawei Zhang, Hao Li, Zhouxia Wang, Luwei Hou, Dongqing Zou, Liheng Bian
Since text prior is important to guarantee the correctness of the restored text structure according to existing arts, we also propose a Text Diffusion Model (TDM) for text recognition which can guide IDM to generate text images with correct structures.
2 code implementations • 10 Dec 2023 • Xu Zhang, Hao Li, Mang Ye
Since clean samples are easier distinguished by GMM with increasing noise, the memory bank can still maintain high quality at a high noise ratio.
Cross-modal retrieval with noisy correspondence Image-text matching +3
no code implementations • 10 Dec 2023 • Hao Li, Brandon Bennett
Text classification is an important topic in the field of natural language processing.
1 code implementation • 7 Dec 2023 • Xiao-Yin Liu, Xiao-Hu Zhou, Guotao Li, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou
This method trades off performance and robustness via introducing the robust Bellman operator into the algorithm.
no code implementations • CVPR 2024 • Phong Tran, Egor Zakharov, Long-Nhat Ho, Anh Tuan Tran, Liwen Hu, Hao Li
We present a 3D-aware one-shot head reenactment method based on a fully volumetric neural disentanglement framework for source appearance and driver expressions.
1 code implementation • 5 Dec 2023 • Hao Li, Curise Jia, Peng Jin, Zesen Cheng, Kehan Li, Jialu Sui, Chang Liu, Li Yuan
In this paper, we propose the Style-Diversified Query-Based Image Retrieval task, which enables retrieval based on various query styles.
1 code implementation • 30 Nov 2023 • Guangming Zhu, Siyuan Wang, Qing Cheng, Kelong Wu, Hao Li, Liang Zhang
With the recent surge in the use of touchscreen devices, free-hand sketching has emerged as a promising modality for human-computer interaction.
1 code implementation • 21 Nov 2023 • Jiacheng Wang, Hao Li, Dewei Hu, Yuankai K. Tao, Ipek Oguz
High-resolution Optical Coherence Tomography (OCT) images are crucial for ophthalmology studies but are limited by their relatively narrow field of view (FoV).
1 code implementation • 20 Nov 2023 • Ling Luo, Jinzhong Ning, Yingwen Zhao, Zhijun Wang, Zeyuan Ding, Peng Chen, Weiru Fu, Qinyu Han, Guangtao Xu, Yunzhi Qiu, Dinghao Pan, Jiru Li, Hao Li, Wenduo Feng, Senbo Tu, Yuqi Liu, Zhihao Yang, Jian Wang, Yuanyuan Sun, Hongfei Lin
The case study involving additional biomedical NLP tasks further shows Taiyi's considerable potential for bilingual biomedical multi-tasking.
no code implementations • CVPR 2024 • Hao Li, Dingwen Zhang, Yalun Dai, Nian Liu, Lechao Cheng, Jingfeng Li, Jingdong Wang, Junwei Han
Applying NeRF to downstream perception tasks for scene understanding and representation is becoming increasingly popular.
no code implementations • 13 Nov 2023 • Danfeng Hong, Bing Zhang, Xuyang Li, YuXuan Li, Chenyu Li, Jing Yao, Naoto Yokoya, Hao Li, Pedram Ghamisi, Xiuping Jia, Antonio Plaza, Paolo Gamba, Jon Atli Benediktsson, Jocelyn Chanussot
The foundation model has recently garnered significant attention due to its potential to revolutionize the field of visual representation learning in a self-supervised manner.
1 code implementation • 13 Nov 2023 • Hao Li, Han Liu, Dewei Hu, Jiacheng Wang, Ipek Oguz
In this paper, we assess the test-time variability for interactive medical image segmentation with diverse point prompts.
1 code implementation • 12 Nov 2023 • Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi
Our experiments demonstrate that preserving the positional information of visual embeddings through the pool-adapter is particularly beneficial for tasks like visual grounding.
Ranked #140 on Visual Question Answering on MM-Vet