no code implementations • 28 Sep 2023 • Shiyi Tang, Yini Fang, Shu Zhang
Small object detection has been a challenging problem in the field of object detection.
1 code implementation • 25 Jul 2023 • Zhengliang Liu, Tianyang Zhong, Yiwei Li, Yutong Zhang, Yi Pan, Zihao Zhao, Peixin Dong, Chao Cao, Yuxiao Liu, Peng Shu, Yaonai Wei, Zihao Wu, Chong Ma, Jiaqi Wang, Sheng Wang, Mengyue Zhou, Zuowei Jiang, Chunlin Li, Jason Holmes, Shaochen Xu, Lu Zhang, Haixing Dai, Kai Zhang, Lin Zhao, Yuanhao Chen, Xu Liu, Peilong Wang, Pingkun Yan, Jun Liu, Bao Ge, Lichao Sun, Dajiang Zhu, Xiang Li, Wei Liu, Xiaoyan Cai, Xintao Hu, Xi Jiang, Shu Zhang, Xin Zhang, Tuo Zhang, Shijie Zhao, Quanzheng Li, Hongtu Zhu, Dinggang Shen, Tianming Liu
The rise of large language models (LLMs) has marked a pivotal shift in the field of natural language processing (NLP).
1 code implementation • 12 Jul 2023 • Shengbo Gao, Ziji Zhang, Jiechao Ma, Zihao Li, Shu Zhang
Our approach is based on a mutual learning strategy that incorporates two modules: the Cross-sample Mutual Attention Module (CMA) and the Omni-Correlation Consistency Module (OCC).
no code implementations • 3 Jul 2023 • Jiaqi Wang, Zhengliang Liu, Lin Zhao, Zihao Wu, Chong Ma, Sigang Yu, Haixing Dai, Qiushi Yang, Yiheng Liu, Songyao Zhang, Enze Shi, Yi Pan, Tuo Zhang, Dajiang Zhu, Xiang Li, Xi Jiang, Bao Ge, Yixuan Yuan, Dinggang Shen, Tianming Liu, Shu Zhang
This review aims to summarize the methods employed in the computer vision domain for large vision models and visual prompt engineering, exploring the latest advancements in visual prompt engineering.
1 code implementation • 1 Jun 2023 • Hong-Yu Zhou, Yizhou Yu, Chengdi Wang, Shu Zhang, Yuanxu Gao, Jia Pan, Jun Shao, Guangming Lu, Kang Zhang, Weimin Li
During the diagnostic process, clinicians leverage multimodal information, such as chief complaints, medical images, and laboratory-test results.
no code implementations • 26 May 2023 • Qichao Wang, Huan Ma, WenTao Wei, Hangyu Li, Liang Chen, Peilin Zhao, Binwen Zhao, Bo Hu, Shu Zhang, Zibin Zheng, Bingzhe Wu
The rapid development of the digital economy has led to the emergence of various black and shadow internet industries, which pose potential risks that can be identified and managed through digital risk management (DRM) using techniques such as machine learning and deep learning.
no code implementations • 18 May 2023 • Can Qin, Shu Zhang, Ning Yu, Yihao Feng, Xinyi Yang, Yingbo Zhou, Huan Wang, Juan Carlos Niebles, Caiming Xiong, Silvio Savarese, Stefano Ermon, Yun Fu, ran Xu
Visual generative foundation models such as Stable Diffusion show promise in navigating these goals, especially when prompted with arbitrary languages.
1 code implementation • 14 May 2023 • Le Xue, Ning Yu, Shu Zhang, Junnan Li, Roberto Martín-Martín, Jiajun Wu, Caiming Xiong, ran Xu, Juan Carlos Niebles, Silvio Savarese
Recent advancements in multimodal pre-training methods have shown promising efficacy in 3D representation learning by aligning multimodal features across 3D shapes, their 2D counterparts, and language descriptions.
Ranked #2 on 3D Point Cloud Classification on ScanObjectNN (using extra training data)
no code implementations • 28 Apr 2023 • Jiaqi Wang, Enze Shi, Sigang Yu, Zihao Wu, Chong Ma, Haixing Dai, Qiushi Yang, Yanqing Kang, Jinru Wu, Huawen Hu, Chenxi Yue, Haiyang Zhang, Yiheng Liu, Xiang Li, Bao Ge, Dajiang Zhu, Yixuan Yuan, Dinggang Shen, Tianming Liu, Shu Zhang
This review will introduce the latest advances in prompt engineering in the field of natural language processing (NLP) for the medical domain.
2 code implementations • 17 Apr 2023 • Chong Ma, Zihao Wu, Jiaqi Wang, Shaochen Xu, Yaonai Wei, Zhengliang Liu, Xi Jiang, Lei Guo, Xiaoyan Cai, Shu Zhang, Tuo Zhang, Dajiang Zhu, Dinggang Shen, Tianming Liu, Xiang Li
The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians, and it is typically written by radiologists based on the 'Findings' section.
2 code implementations • 23 Mar 2023 • Huan Ma, Changqing Zhang, Yatao Bian, Lemao Liu, Zhirui Zhang, Peilin Zhao, Shu Zhang, Huazhu Fu, QinGhua Hu, Bingzhe Wu
Large language models have demonstrated a surprising ability to perform in-context learning, i.e., these models can be directly applied to solve numerous downstream tasks by conditioning on a prompt constructed from a few input-output examples.
1 code implementation • ICCV 2023 • Can Qin, Ning Yu, Chen Xing, Shu Zhang, Zeyuan Chen, Stefano Ermon, Yun Fu, Caiming Xiong, ran Xu
Empirical results show that GlueNet can be trained efficiently and enables various capabilities beyond previous state-of-the-art models: 1) multilingual language models such as XLM-Roberta can be aligned with existing T2I models, allowing for the generation of high-quality images from captions beyond English; 2) GlueNet can align multi-modal encoders such as AudioCLIP with the Stable Diffusion model, enabling sound-to-image generation; 3) it can also upgrade the current text encoder of the latent diffusion model for challenging case generation.
1 code implementation • 16 Mar 2023 • Shu Zhang, Xinyi Yang, Yihao Feng, Can Qin, Chia-Chih Chen, Ning Yu, Zeyuan Chen, Huan Wang, Silvio Savarese, Stefano Ermon, Caiming Xiong, ran Xu
Incorporating human feedback has been shown to be crucial to align text generated by large language models to human preferences.
1 code implementation • 13 Mar 2023 • Arun Tejasvi Chaganty, Megan Leszczynski, Shu Zhang, Ravi Ganti, Krisztian Balog, Filip Radlinski
Users in consumption domains, like music, are often able to more efficiently provide preferences over a set of items (e.g., a playlist or radio) than over single items (e.g., songs).
no code implementations • 27 Jan 2023 • Megan Leszczynski, Ravi Ganti, Shu Zhang, Krisztian Balog, Filip Radlinski, Fernando Pereira, Arun Tejasvi Chaganty
A human evaluation shows that the conversations contain consistent utterances with relevant item sets, nearly matching the quality of small human-collected conversational data for this task.
no code implementations • 20 Oct 2022 • Zeyu Cao, Zhipeng Liang, Shu Zhang, Hangyu Li, Ouyang Wen, Yu Rong, Peilin Zhao, Bingzhe Wu
In this paper, we investigate a novel problem of building contextual bandits in the vertical federated setting, i.e., contextual information is vertically distributed over different departments.
1 code implementation • CVPR 2022 • Shu Zhang, ran Xu, Caiming Xiong, Chetan Ramaiah
Current contrastive learning frameworks focus on leveraging a single supervisory signal to learn representations, which limits the efficacy on unseen data and downstream tasks.
no code implementations • 4 Mar 2022 • Xi Chen, Jiahuan Lv, Dehua Feng, Xuanqin Mou, Ling Bai, Shu Zhang, Zhiguo Zhou
Accurately identifying a patient's status through medical images plays an important role in diagnosis and treatment.
1 code implementation • 7 Feb 2022 • Danhuai Guo, Shiyin Ge, Shu Zhang, Song Gao, Ran Tao, Yangang Wang
Spatial-query-by-sketch is an intuitive tool to explore human spatial knowledge about geographic environments and to support communication with scene database queries.
1 code implementation • 5 Jan 2022 • Shu Zhang, Zihao Li, Hong-Yu Zhou, Jiechao Ma, Yizhou Yu
The difficulties in both data acquisition and annotation substantially restrict the sample sizes of training datasets for 3D medical imaging applications.
Ranked #1 on Medical Object Detection on DeepLesion
no code implementations • 1 Jul 2021 • Chenglin Yu, Dingnan Cui, Muheng Shang, Shu Zhang, Lei Guo, Junwei Han, Lei Du, Alzheimer's Disease Neuroimaging Initiative
Though deep learning models can extract the nonlinear relationship, they could not select relevant genetic factors.
no code implementations • 3 Jun 2021 • Hong-Yu Zhou, Chengdi Wang, Haofeng Li, Gang Wang, Shu Zhang, Weimin Li, Yizhou Yu
Semi-Supervised classification and segmentation methods have been widely investigated in medical image analysis.
1 code implementation • 21 Apr 2021 • Jie Lian, Jingyu Liu, Shu Zhang, Kai Gao, Xiaoqing Liu, Dingwen Zhang, Yizhou Yu
Leveraging constant structure and disease relations extracted from domain knowledge, we propose a structure-aware relation network (SAR-Net) extending Mask R-CNN.
1 code implementation • 16 Dec 2020 • Shu Zhang, Jincheng Xu, Yu-Chun Chen, Jiechao Ma, Zihao Li, Yizhou Wang, Yizhou Yu
We demonstrate that with the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset (3.48% absolute improvement in the sensitivity of FPs@0.5), significantly surpassing the baseline method by up to 6.06% (in MAP@0.5) which adopts 2D convolution for 3D context modeling.
Ranked #4 on Medical Object Detection on DeepLesion
no code implementations • 16 Dec 2020 • Shu Zhang, Jing-Shu Li, Yang-Jie Su, Yu-Mei Zhang, Zi-Yuan Li, Zheng-Yun You
The liquid-based detectors are widely used in particle and nuclear physics experiments.
Instrumentation and Detectors
no code implementations • 2 May 2020 • Yimin Hou, Shuyue Jia, Xiangmin Lun, Shu Zhang, Tao Chen, Fang Wang, Jinglei Lv
The introduced deep feature mining approach can precisely recognize human motion intents from raw EEG signals, which paves the way for translating EEG-based MI recognition to practical BCI systems.
1 code implementation • CVPR 2020 • Hoang Le, Feng Liu, Shu Zhang, Aseem Agarwala
We then develop a multi-scale neural network and show that when properly trained using our new dataset, this neural network can already handle dynamic scenes to some extent.
no code implementations • 13 Dec 2019 • Hongwei Xv, Xin Sun, Junyu Dong, Shu Zhang, Qiong Li
Low-shot learning indicates the ability to recognize unseen objects based on very limited labeled training samples, which simulates human visual intelligence.
1 code implementation • 10 Sep 2019 • Zihao Li, Shu Zhang, Junge Zhang, Kaiqi Huang, Yizhou Wang, Yizhou Yu
In this paper, we propose to incorporate domain knowledge in clinical practice into the model design of universal lesion detectors.
Ranked #8 on Medical Object Detection on DeepLesion
no code implementations • 9 Jun 2019 • Benlin Hu, Cheng Lei, Dong Wang, Shu Zhang, Zhenyu Chen
Deep learning models have a large number of free parameters that need to be calculated by effective training of the models on a great deal of training data to improve their generalization performance.
1 code implementation • CVPR 2019 • Chenyou Fan, Xiaofan Zhang, Shu Zhang, Wensheng Wang, Chi Zhang, Heng Huang
In this paper, we propose a novel end-to-end trainable Video Question Answering (VideoQA) framework with three major components: 1) a new heterogeneous memory which can effectively learn global context information from appearance and motion features; 2) a redesigned question memory which helps understand the complex semantics of question and highlights queried subjects; and 3) a new multimodal fusion layer which performs multi-step reasoning by attending to relevant visual and textual hints with self-updated attention.
Ranked #29 on Visual Question Answering (VQA) on MSRVTT-QA
no code implementations • 24 Apr 2017 • Shu Zhang, Hui Yu, Ting Wang, Junyu Dong, Honghai Liu
With the increasing demands of applications in virtual reality such as 3D films, virtual Human-Machine Interactions and virtual agents, the analysis of 3D human faces is considered to be more and more important as a fundamental step for those virtual reality tasks.
3 code implementations • ICCV 2017 • Rui Huang, Shu Zhang, Tianyu Li, Ran He
This paper proposes a Two-Pathway Generative Adversarial Network (TP-GAN) for photorealistic frontal view synthesis by simultaneously perceiving global structures and local details.
no code implementations • COLING 2016 • Hailong Cao, Tiejun Zhao, Shu Zhang, Yao Meng
We introduce a distribution based model to learn bilingual word embeddings from monolingual data.
no code implementations • 16 Nov 2016 • Shu Zhang, Ran He, Tieniu Tan
The occlusions incurred by random meshes severely degrade the performance of face verification systems, which raises the MeshFace verification problem between MeshFace and daily photos.
no code implementations • 21 May 2016 • Shu Zhang, Qi Zhu, Amit Roy-Chowdhury
In this paper, we focus on this problem and propose a framework to adaptively select the "best" algorithm-parameter combination and the computation platform under performance and cost constraints at design time, and adapt the algorithms at runtime based on real-time inputs.