no code implementations • 5 May 2025 • Qingqiu Li, Zihang Cui, Seongsu Bae, Jilan Xu, Runtian Yuan, Yuejie Zhang, Rui Feng, Quanli Shen, Xiaobo Zhang, Junjun He, Shujun Wang
Chest X-rays (CXRs) are the most frequently performed imaging examinations in clinical settings.
no code implementations • 16 Apr 2025 • Jilan Xu, Yifei HUANG, Baoqi Pei, Junlin Hou, Qingqiu Li, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie
Next, we employ a video diffusion model to predict future ego-frames using the first ego-frame and textual instructions, while incorporating the HOI masks as structural guidance to enhance prediction quality.
no code implementations • 26 Feb 2025 • Anton Alyakin, Jaden Stryker, Daniel Alexander Alber, Karl L. Sangwon, Jin Vivian Lee, Brandon Duderstadt, Akshay Save, David Kurland, Spencer Frome, Shrutika Singh, Jeff Zhang, Eunice Yang, Ki Yun Park, Cordelia Orillac, Aly A. Valliani, Sean Neifert, Albert Liu, Aneek Patel, Christopher Livia, Darryl Lau, Ilya Laufer, Peter A. Rozman, Eveline Teresa Hidalgo, Howard Riina, Rui Feng, Todd Hollon, Yindalon Aphinyanaphongs, John G. Golfinos, Laura Snyder, Eric Leuthardt, Douglas Kondziolka, Eric Karl Oermann
Using NeuroPubs, VLMs generated publication-ready graphical abstracts (70% of 100 abstracts) and board-style questions indistinguishable from human-written ones (54% of 89, 587 questions).
no code implementations • 16 Feb 2025 • Runtian Yuan, Jilan Xu, Mohan Chen, Qingqiu Li, Yuejie Zhang, Rui Feng, Tao Zhang, Shang Gao
We develop a strong baseline model, Text-Promptable Propagation (TPP), designed to exploit the intrinsic relationships among sequential images and their associated textual descriptions.
1 code implementation • 16 Jan 2025 • Hongzhou Yu, Tianhao Cheng, Ying Cheng, Rui Feng
We proposed FineMedLM-o1, which leverages high-quality synthetic medical data and long-form reasoning data for Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO), enabling advanced dialogue and deep reasoning capabilities.
no code implementations • 9 Dec 2024 • Yin-Long Liu, Rui Feng, Jia-Hong Yuan, Zhen-Hua Ling
Compared to other clinical screening techniques, speech-and-language-based automated Alzheimer's disease (AD) detection methods are characterized by their non-invasiveness, cost-effectiveness, and convenience.
Alzheimer's Disease Detection
Automatic Speech Recognition
+6
no code implementations • 1 Nov 2024 • Fan Xiao, Junlin Hou, RuiWei Zhao, Rui Feng, Haidong Zou, Lina Lu, Yi Xu, Juzhao Zhang
Diabetic retinopathy (DR) is a leading cause of blindness worldwide and a common complication of diabetes.
no code implementations • 28 Oct 2024 • Bowen Zhao, Tianhao Cheng, Yuejie Zhang, Ying Cheng, Rui Feng, Xiaobo Zhang
Most existing researches in MMQA only focus on two modalities such as image-text QA, table-text QA and chart-text QA, and there remains a notable scarcity in studies that investigate the joint analysis of text, tables, and charts.
no code implementations • 4 Sep 2024 • Weiwei Tian, Xinyu Huang, Tianhao Cheng, Wen He, Jinwu Fang, Rui Feng, Daoying Geng, Xiaobo Zhang
This dataset comprised 2D chest X-ray images, 3D chest CT images, corresponding radiology reports, and outpatient and inpatient records.
no code implementations • 15 Jun 2024 • Ruize Wang, Hui Xu, Ying Cheng, Qi He, Xing Zhou, Rui Feng, Wei Xu, Lei Huang, Jie Jiang
We introduce a gain evaluation strategy to calculate information gain, aiding the model in learning helpful information for the target domain and providing the ability to reject noisy samples, thus avoiding negative transfer.
no code implementations • 11 Jun 2024 • Yin-Long Liu, Rui Feng, Jia-Hong Yuan, Zhen-Hua Ling
We uncover an underlying bias present in the audio recordings produced from the picture description task of the Pitt corpus, the largest publicly accessible database for Alzheimer's Disease (AD) detection research.
no code implementations • 8 Apr 2024 • Junlin Hou, Jilan Xu, Rui Feng, Hao Chen
Previous noise learning methods mainly considered noise arising from images being mislabeled, i. e. label noise, assuming that all mislabeled images are of high image quality.
no code implementations • 18 Mar 2024 • Runtian Yuan, Qingqiu Li, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen
In response to the need for rapid and accurate COVID-19 diagnosis during the global pandemic, we present a two-stage framework that leverages pseudo labels for domain adaptation to enhance the detection of COVID-19 from CT scans.
no code implementations • 18 Mar 2024 • Qingqiu Li, Runtian Yuan, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen
To make a more accurate diagnosis of COVID-19, we propose a straightforward yet effective model.
no code implementations • 14 Mar 2024 • Qingqiu Li, Xiaohan Yan, Jilan Xu, Runtian Yuan, Yuejie Zhang, Rui Feng, Quanli Shen, Xiaobo Zhang, Shujun Wang
For finding and existence, we regard them as image tags, applying an image-tag recognition decoder to associate image features with their respective tags within each sample and constructing soft labels for contrastive learning to improve the semantic association of different image-report pairs.
no code implementations • CVPR 2024 • Jilan Xu, Yifei HUANG, Junlin Hou, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie
In this paper, (1) we develop EgoInstructor, a retrieval-augmented multimodal captioning model that automatically retrieves semantically relevant third-person instructional videos to enhance the video captioning of egocentric videos.
no code implementations • 13 Dec 2023 • Bowen Zhao, Changkai Ji, Yuejie Zhang, Wen He, Yingwen Wang, Qing Wang, Rui Feng, Xiaobo Zhang
With the Generative Pre-trained Transformer 3. 5 (GPT-3. 5) exhibiting remarkable reasoning and comprehension abilities in Natural Language Processing (NLP), most Question Answering (QA) research has primarily centered around general QA tasks based on GPT, neglecting the specific challenges posed by Complex Table QA.
no code implementations • 5 Dec 2023 • Xiaze Zhang, Ziheng Ding, Qi Jing, Yuejie Zhang, Wenchao Ding, Rui Feng
Point clouds have shown significant potential in various domains, including Simultaneous Localization and Mapping (SLAM).
no code implementations • 1 Dec 2023 • Ji Bian, Jian Sun, Cheng-Xiang Wang, Rui Feng, Jie Huang, Yang Yang, Minggao Zhang
In this paper, a three-dimensional (3-D) non-stationary wideband multiple-input multiple-output (MIMO) channel model based on the WINNER+ channel model is proposed.
no code implementations • 1 Nov 2023 • Qingqiu Li, Jilan Xu, Runtian Yuan, Mohan Chen, Yuejie Zhang, Rui Feng, Xiaobo Zhang, Shang Gao
Automatic generation of radiology reports holds crucial clinical value, as it can alleviate substantial workload on radiologists and remind less experienced ones of potential anomalies.
2 code implementations • 23 Oct 2023 • Xinyu Huang, Yi-Jie Huang, Youcai Zhang, Weiwei Tian, Rui Feng, Yuejie Zhang, Yanchun Xie, Yaqian Li, Lei Zhang
Specifically, for predefined commonly used tag categories, RAM++ showcases 10. 2 mAP and 15. 4 mAP enhancements over CLIP on OpenImages and ImageNet.
no code implementations • 1 Sep 2023 • Rui Feng, Huan Tran, Aubrey Toland, Binghong Chen, Qi Zhu, Rampi Ramprasad, Chao Zhang
Machine learning (ML) forcefields have been developed to achieve both the accuracy of ab initio methods and the efficiency of empirical force fields.
no code implementations • 5 Apr 2023 • Bo Qian, Hao Chen, Xiangning Wang, Haoxuan Che, Gitaek Kwon, Jaeyoung Kim, Sungjin Choi, Seoyoung Shin, Felix Krause, Markus Unterdechler, Junlin Hou, Rui Feng, Yihao Li, Mostafa El Habib Daho, Qiang Wu, Ping Zhang, Xiaokang Yang, Yiyu Cai, Weiping Jia, Huating Li, Bin Sheng
Computer-assisted automatic analysis of diabetic retinopathy (DR) is of great importance in reducing the risks of vision loss and even blindness.
2 code implementations • 10 Mar 2023 • Xinyu Huang, Youcai Zhang, Jinyu Ma, Weiwei Tian, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Lei Zhang
This paper presents Tag2Text, a vision language pre-training (VLP) framework, which introduces image tagging into vision-language models to guide the learning of visual-linguistic features.
1 code implementation • CVPR 2023 • Jilan Xu, Junlin Hou, Yuejie Zhang, Rui Feng, Yi Wang, Yu Qiao, Weidi Xie
The former aims to infer all masked entities in the caption given the group tokens, that enables the model to learn fine-grained alignment between visual groups and text entities.
Open Vocabulary Semantic Segmentation
Open-Vocabulary Semantic Segmentation
+1
no code implementations • 26 Nov 2022 • Junlin Hou, Jilan Xu, Nan Zhang, Yuejie Zhang, Xiaobo Zhang, Rui Feng
In our approach, we devise a novel infection-aware 3D Contrastive Mixup Classification network for severity grading.
no code implementations • 26 Nov 2022 • Junlin Hou, Jilan Xu, Nan Zhang, Yi Wang, Yuejie Zhang, Xiaobo Zhang, Rui Feng
This paper presents our solution for the 2nd COVID-19 Competition, occurring in the framework of the AIMIA Workshop at the European Conference on Computer Vision (ECCV 2022).
1 code implementation • 26 Nov 2022 • Junlin Hou, Jilan Xu, Fan Xiao, Rui-Wei Zhao, Yuejie Zhang, Haidong Zou, Lina Lu, Wenwen Xue, Rui Feng
However, automatic DR grading based on two-field fundus photography remains a challenging task due to the lack of publicly available datasets and effective fusion strategies.
1 code implementation • 25 Nov 2022 • Lingkai Kong, Jiaming Cui, Yuchen Zhuang, Rui Feng, B. Aditya Prakash, Chao Zhang
Decision-focused learning (DFL) was recently proposed for stochastic optimization problems that involve unknown parameters.
1 code implementation • 2 Oct 2022 • Junlin Hou, Fan Xiao, Jilan Xu, Yuejie Zhang, Haidong Zou, Rui Feng
In the image quality assessment task, we create an ensemble of InceptionV3, SE-ResNeXt, and Vision Transformer models.
1 code implementation • 12 Jul 2022 • Xinyu Huang, Youcai Zhang, Ying Cheng, Weiwei Tian, RuiWei Zhao, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Xiaobo Zhang
However, the image-text pairs co-occurrent on the Internet typically lack explicit alignment information, which is suboptimal for VLP.
1 code implementation • 12 Jul 2022 • Jiashuo Yu, Jinyu Liu, Ying Cheng, Rui Feng, Yuejie Zhang
In this paper, we analyze the modality asynchrony and undifferentiated instances phenomena of the multiple instance learning (MIL) procedure, and further investigate its negative impact on weakly-supervised audio-visual learning.
Ranked #10 on
Anomaly Detection In Surveillance Videos
on XD-Violence
Anomaly Detection In Surveillance Videos
audio-visual learning
+1
no code implementations • 7 Jul 2022 • Jiashuo Yu, Junfu Pu, Ying Cheng, Rui Feng, Ying Shan
Although audio-visual representation has been proved to be applicable in many downstream tasks, the representation of dancing videos, which is more specific and always accompanied by music with complex auditory contents, remains challenging and uninvestigated.
no code implementations • 5 Jul 2022 • Junlin Hou, Jilan Xu, Rui Feng, Yuejie Zhang
This paper presents our solution for the 2nd COVID-19 Competition, occurring in the framework of the AIMIA Workshop in the European Conference on Computer Vision (ECCV 2022).
1 code implementation • 21 Jun 2022 • Shuaicheng Li, Feng Zhang, Rui-Wei Zhao, Rui Feng, Kunlin Yang, Lingbo Liu, Jun Hou
Based on PRSlot modules, we present a novel Pyramid Region-based Slot Attention Network termed PRSA-Net to learn a unified visual representation with rich temporal and semantic context for better proposal generation.
no code implementations • 6 Jun 2022 • Jie Huang, Cheng-Xiang Wang, Yingzhuo Sun, Rui Feng, Jialing Huang, Bolun Guo, Zhimeng Zhong, Tie Jun Cui
Reconfigurable intelligent surfaces (RISs) are two dimensional (2D) metasurfaces which can intelligently manipulate electromagnetic waves by low-cost near passive reflecting elements.
1 code implementation • CVPR 2022 • Jilan Xu, Junlin Hou, Yuejie Zhang, Rui Feng, Rui-Wei Zhao, Tao Zhang, Xuequan Lu, Shang Gao
In this paper, we empirically prove that this problem is associated with the mixup of the activation values between less discriminative foreground regions and the background.
no code implementations • 10 Apr 2022 • Jinyu Liu, Ying Cheng, Yuejie Zhang, Rui-Wei Zhao, Rui Feng
Visual-only self-supervised learning has achieved significant improvement in video representation learning.
no code implementations • NAACL 2022 • Rui Feng, Chen Luo, Qingyu Yin, Bing Yin, Tuo Zhao, Chao Zhang
User sessions empower many search and recommendation tasks on a daily basis.
2 code implementations • 13 Dec 2021 • Youcai Zhang, Yuhao Cheng, Xinyu Huang, Fei Wen, Rui Feng, Yaqian Li, Yandong Guo
Multi-label learning in the presence of missing labels (MLML) is a challenging problem.
1 code implementation • 24 Nov 2021 • Jiashuo Yu, Ying Cheng, Rui-Wei Zhao, Rui Feng, Yuejie Zhang
Recognizing and localizing events in videos is a fundamental task for video understanding.
no code implementations • 7 Apr 2021 • Jiashuo Yu, Ying Cheng, Rui Feng
The localization subnetwork consists of Multimodal Bottleneck Attention Module (MBAM), which is designed to extract fine-grained segment-level contents.
no code implementations • 21 Oct 2020 • Rui Feng, Jie Yuan, Chao Zhang
We argue that the event extraction models so trained are inherently label-hungry, and can generalize poorly across domains and text genres. We propose a reading comprehension framework for event extraction. Specifically, we formulate event detection as a textual entailment prediction problem, and argument detection as a question answer-ing problem.
1 code implementation • 5 Oct 2020 • Yinghao Li, Rui Feng, Isaac Rehg, Chao Zhang
We study the problem of using (partial) constituency parse trees as syntactic guidance for controlled text generation.
no code implementations • 13 Aug 2020 • Ying Cheng, Ruize Wang, Zhihao Pan, Rui Feng, Yuejie Zhang
When watching videos, the occurrence of a visual event is often accompanied by an audio event, e. g., the voice of lip motion, the music of playing instruments.
no code implementations • 19 Jul 2020 • Wenxi Li, Zhuoqun Cao, Qian Wang, Songjian Chen, Rui Feng
Density regression has been widely employed in crowd counting.
1 code implementation • 30 Apr 2019 • Rui Feng, Yang Yang, Yuehan Lyu, Chenhao Tan, Yizhou Sun, Chunping Wang
Fairness has become a central issue for our research community as classification algorithms are adopted in societally critical domains such as recidivism prediction and loan approval.
no code implementations • 29 Nov 2017 • Rui Feng, Yang Yang, Wenjie Hu, Fei Wu, Yueting Zhuang
Existing network embedding works primarily focus on preserving the microscopic structure, such as the first- and second-order proximity of vertexes, while the macroscopic scale-free property is largely ignored.