no code implementations • ACL (ECNLP) 2021 • Hang Zhang, Liling Tan
In this paper, we explored different levels of textual representations for cross-lingual information retrieval.
no code implementations • ICML 2020 • Hang Zhang, Ping Li
Unlabeled linear regression, or "linear regression with an unknown permutation", has attracted increasing attention due to its applications in record linkage and de-anonymization.
no code implementations • AMTA 2022 • Hang Zhang, Liling Tan, Amita Misra
Multilingual query localization is integral to modern e-commerce.
no code implementations • 27 Jan 2025 • Hang Zhang, Qian Lou, Yanshan Wang
Large language models (LLMs) are increasingly utilized in healthcare applications.
1 code implementation • 22 Jan 2025 • Boqiang Zhang, Kehan Li, Zesen Cheng, Zhiqiang Hu, Yuqian Yuan, Guanzheng Chen, Sicong Leng, Yuming Jiang, Hang Zhang, Xin Li, Peng Jin, Wenqi Zhang, Fan Wang, Lidong Bing, Deli Zhao
The key insight of our vision-centric training paradigm is that high-quality image-text data is crucial for both image and video understanding.
1 code implementation • 1 Jan 2025 • Wenqi Zhang, Hang Zhang, Xin Li, Jiashuo Sun, Yongliang Shen, Weiming Lu, Deli Zhao, Yueting Zhuang, Lidong Bing
Compared to its counterparts, our video-centric textbook offers more coherent context, richer knowledge, and better image-text alignment.
1 code implementation • 31 Dec 2024 • Yuqian Yuan, Hang Zhang, Wentong Li, Zesen Cheng, Boqiang Zhang, Long Li, Xin Li, Deli Zhao, Wenqiao Zhang, Yueting Zhuang, Jianke Zhu, Lidong Bing
Finally, we meticulously create a VideoRefer-Bench to comprehensively assess the spatial-temporal understanding capability of a Video LLM, evaluating it across various aspects.
no code implementations • 30 Dec 2024 • Yixian Shen, Hang Zhang, Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du, Yiyi Tao
Digital accessibility is a cornerstone of inclusive content delivery, yet many EPUB files fail to meet fundamental accessibility standards, particularly in providing descriptive alt text for images.
no code implementations • 28 Dec 2024 • Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du, Yiyi Tao, Yixian Shen, Hang Zhang
Large Language Models (LLMs) have demonstrated significant effectiveness across various NLP tasks, including text ranking.
no code implementations • 22 Dec 2024 • Yiyi Tao, Yixian Shen, Hang Zhang, Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du
The increasing deployment of Large Language Models (LLMs) in various applications necessitates a rigorous evaluation of their robustness against adversarial attacks.
no code implementations • 15 Dec 2024 • Hang Zhang, Zhuoling Li, Jun Liu
Dynamic scenes contain intricate spatio-temporal information, crucial for mobile robots, UAVs, and autonomous driving systems to make informed decisions.
no code implementations • 4 Dec 2024 • Yijia Guo, Wenkai Huang, Yang Li, Gaolei Li, Hang Zhang, Liwen Hu, Jianhua Li, Tiejun Huang, Lei Ma
3D Gaussian splatting (3DGS) has demonstrated impressive 3D reconstruction performance with explicit scene representations.
no code implementations • 4 Dec 2024 • Runjian Chen, Hang Zhang, Avinash Ravichandran, Wenqi Shao, Alex Wong, Ping Luo
In this paper, we explore joint unsupervised pre-training for fusion 3D perception via differentiable rendering and propose CLAP, short for Curvature sampLing and swApping Prototype assignment prediction.
no code implementations • 28 Oct 2024 • Jiacheng Wang, Xiang Chen, Renjiu Hu, Rongguang Wang, Min Liu, Yaonan Wang, Jiazheng Wang, Hao Li, Hang Zhang
Co-examination of second-harmonic generation (SHG) and bright-field (BF) microscopy enables the differentiation of tissue components and collagen fibers, aiding the analysis of human breast and pancreatic cancer tissues.
1 code implementation • 22 Oct 2024 • Zesen Cheng, Hang Zhang, Kehan Li, Sicong Leng, Zhiqiang Hu, Fei Wu, Deli Zhao, Xin Li, Lidong Bing
Contrastive loss is a powerful approach for representation learning, where larger batch sizes enhance performance by providing more negative samples to better distinguish between similar and dissimilar data.
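As a rough illustration of why batch size matters here, the sketch below computes an InfoNCE-style contrastive loss with in-batch negatives: each anchor is scored against every positive in the batch, so a batch of size N supplies N - 1 negatives per anchor. The function names and the `temperature` value are assumptions chosen for illustration, not the formulation used in the paper.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    positives[i] is the match for anchors[i]; every other row of
    `positives` acts as a negative for that anchor.
    """
    anchors = [normalize(a) for a in anchors]
    positives = [normalize(p) for p in positives]
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [dot(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the whole batch.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)
```

The full similarity matrix in this sketch has one entry per anchor-positive pair, so its size grows quadratically with the batch size.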
1 code implementation • 16 Oct 2024 • Sicong Leng, Yun Xing, Zesen Cheng, Yang Zhou, Hang Zhang, Xin Li, Deli Zhao, Shijian Lu, Chunyan Miao, Lidong Bing
Recent advancements in large multimodal models (LMMs) have significantly enhanced performance across diverse tasks, with ongoing efforts to further integrate additional modalities such as video and audio.
1 code implementation • 16 Oct 2024 • Yongxin Zhu, Bocheng Li, Hang Zhang, Xin Li, Linli Xu, Lidong Bing
Furthermore, we propose a simple but effective discrete image tokenizer to stabilize the latent space for image generative modeling by applying K-Means on the latent features of self-supervised learning models.
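The idea of turning continuous latent features into discrete tokens with K-Means can be sketched as follows. This is a minimal, assumption-laden toy: the features are plain Python lists standing in for self-supervised model outputs, the centroids are initialized from the first `k` features for determinism, and `fit_kmeans` and `tokenize` are hypothetical names rather than the authors' API.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(centroids, x):
    """Index of the centroid closest to x (ties go to the lowest index)."""
    return min(range(len(centroids)), key=lambda k: squared_distance(centroids[k], x))


def fit_kmeans(features, k, iterations=20):
    """Plain Lloyd iterations; assumes len(features) >= k."""
    # Deterministic initialization from the first k features (an assumption).
    centroids = [list(f) for f in features[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for x in features:
            clusters[nearest(centroids, x)].append(x)
        for idx, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[idx] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return centroids


def tokenize(features, centroids):
    """Map each feature vector to the id of its nearest centroid."""
    return [nearest(centroids, x) for x in features]
```

For example, `tokenize(features, fit_kmeans(features, k=2))` returns one integer token id per feature vector, and the centroid list plays the role of the codebook.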
Ranked #1 on Conditional Image Generation on ImageNet 256x256
1 code implementation • 15 Oct 2024 • Fei Tang, Yongliang Shen, Hang Zhang, Zeqi Tan, Wenqi Zhang, Guiyang Hou, Kaitao Song, Weiming Lu, Yueting Zhuang
GaVaMoE introduces two key components: (1) a rating reconstruction module that employs Variational Autoencoder (VAE) with a Gaussian Mixture Model (GMM) to capture complex user-item collaborative preferences, serving as a pre-trained multi-gating mechanism; and (2) a set of fine-grained expert models coupled with the multi-gating mechanism for generating highly personalized explanations.
no code implementations • 7 Oct 2024 • Yuelyu Ji, Wenhe Ma, Sonish Sivarajkumar, Hang Zhang, Eugene Mathew Sadhu, Zhuochun Li, Xizhi Wu, Shyam Visweswaran, Yanshan Wang
Recent advancements in large language models have demonstrated their potential in numerous medical applications, particularly in automating clinical trial matching for translational research and enhancing medical question answering for clinical decision support.
no code implementations • 15 Sep 2024 • Yiyi Tao, Zhuoyue Wang, Hang Zhang, Lun Wang
In noise-adaptive learning, we estimate the noise probability of each image-text pair based on the transformer's memorization effect and employ noise-adaptive regularization on image-text contrastive learning to condition cross-modal alignment.
no code implementations • 14 Sep 2024 • Hang Zhang, Yang Xu, Lei Gong, Ye Zhu, Kai Ming Ting
This paper introduces a new framework for clustering in a distributed network called Distributed Clustering based on Distributional Kernel (K) or KDC that produces the final clusters based on the similarity with respect to the distributions of initial clusters, as measured by K. It is the only framework that satisfies all three of the following properties.
no code implementations • 13 Sep 2024 • Farnoosh Javadi, Phanideep Gampa, Alyssa Woo, Xingxing Geng, Hang Zhang, Jose Sepulveda, Belhassen Bayar, Fei Wang
Furthermore, our novel prompt engineering framework yields higher quality LLM-generated data to be used for weak supervision; we observed a 47.60% improvement over baseline in agreement rate between LLM predictions and human annotations with respect to F1 score, weighted according to the distribution of occurrences of the search queries.
1 code implementation • 8 Sep 2024 • Jiazheng Wang, Xiang Chen, Yuxi Zhang, Min Liu, Yaonan Wang, Hang Zhang
However, due to the differences between multimodal images and intraoperative image deformation caused by tissue displacement and removal during surgery, effective registration of preoperative and intraoperative multimodal images faces significant challenges.
no code implementations • 2 Sep 2024 • Yuxi Zhang, Xiang Chen, Jiazheng Wang, Min Liu, Yaonan Wang, Dongdong Liu, Renjiu Hu, Hang Zhang
In this paper, we summarize the methods and experimental results we proposed for Task 2 in the learn2reg 2024 Challenge.
no code implementations • 16 Jul 2024 • Zehan Wang, Ziang Zhang, Hang Zhang, Luping Liu, Rongjie Huang, Xize Cheng, Hengshuang Zhao, Zhou Zhao
Given the foundational role of multimodal joint representation in understanding and generation pipelines, high-quality omni joint representations would be a step toward co-processing more diverse multimodal information.
1 code implementation • 10 Jul 2024 • Hang Zhang, Xiang Chen, Renjiu Hu, Dongdong Liu, Gaolei Li, Rongguang Wang
In this paper, we address this issue with MemWarp, a learning framework that leverages a memory network to store prototypical information tailored to different anatomical regions.
4 code implementations • 11 Jun 2024 • Zesen Cheng, Sicong Leng, Hang Zhang, Yifei Xin, Xin Li, Guanzheng Chen, Yongxin Zhu, Wenqi Zhang, Ziyang Luo, Deli Zhao, Lidong Bing
In this paper, we present VideoLLaMA 2, a set of Video Large Language Models (Video-LLMs) designed to enhance spatial-temporal modeling and audio understanding in video and audio-oriented tasks.
Ranked #3 on Video Question Answering on Perception Test
2 code implementations • 27 May 2024 • Xianfu Cheng, Hang Zhang, Jian Yang, Xiang Li, Weixiao Zhou, Fei Liu, Kui Wu, Xiangyuan Guan, Tao Sun, Xianjie Wu, Tongliang Li, Zhoujun Li
In the domain of Document AI, parsing semi-structured image forms is a crucial Key Information Extraction (KIE) task.
no code implementations • 26 May 2024 • Chao Li, Jinwei Zhang, Hang Zhang, Jiahao Li, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang
Purpose: To develop a pipeline for motion artifact correction in mGRE and quantitative susceptibility mapping (QSM).
no code implementations • 15 Apr 2024 • Jie Zhou, Xin Chen, Hang Zhang, Zhe Li
Building on these results, we detail the automatic construction process of case knowledge graphs for judicial cases, enabling the assembly of knowledge graphs for hundreds of thousands of judgments.
no code implementations • 19 Mar 2024 • Mingqi Shao, Feng Xiong, Hang Zhang, Shuang Yang, Mu Xu, Wei Bian, Xueqian Wang
The global stage obtains a continuous representation of the entire scene while the focal stage decomposes the scene into multiple blocks and further processes them with distinct sub-encoders.
no code implementations • 16 Mar 2024 • Yang Cao, Haolong Xiang, Hang Zhang, Ye Zhu, Kai Ming Ting
Anomaly detection is a longstanding and active research area that has many applications in domains such as finance, security, and manufacturing.
no code implementations • 15 Mar 2024 • Hang Zhang, Wenxiao Zhang, Haoxuan Qu, Jun Liu
Human-centered dynamic scene understanding plays a pivotal role in enhancing the capability of robotic and autonomous systems, in which Video-based Human-Object Interaction (V-HOI) detection is a crucial task in semantic scene understanding, aimed at comprehensively understanding HOI relationships within a video to benefit the behavioral decisions of mobile robots and autonomous driving systems.
1 code implementation • 18 Jan 2024 • Xianfu Cheng, Weixiao Zhou, Xiang Li, Jian Yang, Hang Zhang, Tao Sun, Wei Zhang, Yuying Mai, Tongliang Li, Xiaoming Chen, Zhoujun Li
In this work, we propose a VIsion Permutable extractor for fast and efficient Scene Text Recognition (SVIPTR), which achieves an impressive balance between high performance and rapid inference speeds in the domain of STR.
no code implementations • 18 Jan 2024 • Hang Zhang, Xiang Chen, Rongguang Wang, Renjiu Hu, Dongdong Liu, Gaolei Li
In medical imaging, scans often reveal objects with varied contrasts but consistent internal intensities or textures.
no code implementations • CVPR 2024 • Hang Zhang, Anton Savov, Benjamin Dillenburger
Layout planning spanning from architecture to interior design is a slow iterative exploration of ill-defined problems adopting a "I'll know it when I see it" approach to potential solutions.
no code implementations • 28 Dec 2023 • Hang Zhang, Thanh D. Nguyen, Jinwei Zhang, Renjiu Hu, Susan A. Gauthier, Yi Wang
We validated RimSet using simulated QSM images and an in vivo dataset of 172 MS subjects with 177 rim+ and 3986 rim- lesions.
1 code implementation • 1 Dec 2023 • Xuan-Phi Nguyen, Wenxuan Zhang, Xin Li, Mahani Aljunied, Zhiqiang Hu, Chenhui Shen, Yew Ken Chia, Xingxuan Li, Jianyu Wang, Qingyu Tan, Liying Cheng, Guanzheng Chen, Yue Deng, Sen Yang, Chaoqun Liu, Hang Zhang, Lidong Bing
Despite the remarkable achievements of large language models (LLMs) in various tasks, there remains a linguistic bias that favors high-resource languages, such as English, often at the expense of low-resource and regional languages.
6 code implementations • CVPR 2024 • Sicong Leng, Hang Zhang, Guanzheng Chen, Xin Li, Shijian Lu, Chunyan Miao, Lidong Bing
Large Vision-Language Models (LVLMs) have advanced considerably, intertwining visual recognition and language understanding to generate content that is not only coherent but also contextually attuned.
1 code implementation • 27 Nov 2023 • Xiang Chen, Min Liu, Rongguang Wang, Renjiu Hu, Dongdong Liu, Gaolei Li, Hang Zhang
Medical images are often characterized by their structured anatomical representations and spatially inhomogeneous contrasts.
Ranked #2 on Image Registration on Unpaired-abdomen-CT (using extra training data)
no code implementations • 7 Nov 2023 • Hang Zhang, Yeyun Gong, Xingwei He, Dayiheng Liu, Daya Guo, Jiancheng Lv, Jian Guo
Most dense retrieval models contain an implicit assumption: the training query-document pairs are exactly matched.
1 code implementation • 2 Nov 2023 • Hang Zhang
In this research, I proposed a network structure for multi-view 3D object detection using camera-only data and a Bird's-Eye-View map.
no code implementations • 31 Oct 2023 • Hang Zhang, Ping Li
By linking this equation to the branching random walk process, we are able to characterize the impact of the signal-to-noise ratio ($\mathrm{SNR}$) on the permutation recovery.
no code implementations • 2 Oct 2023 • Hang Zhang, Ping Li
This paper considers the task of linear regression with shuffled labels, i.e., $\mathbf Y = \mathbf \Pi \mathbf X \mathbf B + \mathbf W$, where $\mathbf Y \in \mathbb R^{n\times m}, \mathbf \Pi \in \mathbb R^{n\times n}, \mathbf X\in \mathbb R^{n\times p}, \mathbf B \in \mathbb R^{p\times m}$, and $\mathbf W\in \mathbb R^{n\times m}$, respectively, represent the sensing results, (unknown or missing) corresponding information, sensing matrix, signal of interest, and additive sensing noise.
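To make the shuffled-label model concrete, the following sketch simulates one instance of it: it draws a sensing matrix and a signal, forms the clean product, permutes its rows, and adds noise. The Gaussian choices for the entries and for the noise, the `seed` parameter, and the function names are assumptions for illustration; the source only states the model itself.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def simulate_shuffled_regression(n, p, m, noise_std, seed=0):
    """Draw (X, B, perm, Y) with Y = Pi X B + W.

    The permutation matrix Pi is stored as a list `perm`, where row i of
    Pi has its single 1 in column perm[i], so (Pi Z)[i] = Z[perm[i]].
    """
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]  # n x p
    B = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(p)]  # p x m
    clean = matmul(X, B)  # n x m, rows still in their original order
    perm = list(range(n))
    rng.shuffle(perm)
    Y = [
        [clean[perm[i]][j] + rng.gauss(0.0, noise_std) for j in range(m)]
        for i in range(n)
    ]
    return X, B, perm, Y
```

With `noise_std=0.0`, row `i` of `Y` equals row `perm[i]` of the clean product exactly, which is a convenient sanity check before testing any recovery procedure.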
1 code implementation • 5 Jun 2023 • Hang Zhang, Renjiu Hu, Xiang Chen, Rongguang Wang, Jinwei Zhang, Jiahao Li
Specifically, the network incorporating DAGrid has realized a 70.8% reduction in network parameter size and a 96.8% decrease in FLOPs, while concurrently improving the Dice score for skin lesion segmentation by 1.0% compared to state-of-the-art transformers.
4 code implementations • 5 Jun 2023 • Hang Zhang, Xin Li, Lidong Bing
We present Video-LLaMA, a multi-modal framework that empowers Large Language Models (LLMs) with the capability of understanding both visual and auditory content in the video.
Ranked #1 on Video-Text Retrieval on Test-of-Time (using extra training data)
no code implementations • 5 May 2023 • Jinwei Zhang, Alexey Dimov, Chao Li, Hang Zhang, Thanh D. Nguyen, Pascal Spincemaille, Yi Wang
Purpose: To improve the generalization ability of convolutional neural network (CNN) based prediction of quantitative susceptibility mapping (QSM) from high-pass filtered phase (HPFP) images.
no code implementations • 7 Apr 2023 • Jinwei Zhang, Thanh D. Nguyen, Eddy Solomon, Chao Li, Qihao Zhang, Jiahao Li, Hang Zhang, Pascal Spincemaille, Yi Wang
Results: The retrospective ablation study showed improved image sharpness of mcLARO compared to the baseline network without multi-contrast sampling pattern optimization or image feature fusion, and negligible bias and narrow 95% limits of agreement on regional T1, T2, T2* and QSM values were obtained by the under-sampled reconstructions compared to the fully sampled reconstruction.
2 code implementations • 29 Mar 2023 • Xingwei He, Zhenghao Lin, Yeyun Gong, A-Long Jin, Hang Zhang, Chen Lin, Jian Jiao, Siu Ming Yiu, Nan Duan, Weizhu Chen
Many natural language processing (NLP) tasks rely on labeled data to train machine learning models with high performance.
no code implementations • 20 Mar 2023 • Hang Zhang, Ping Li
From the statistical aspect, we first establish the minimax lower bounds on the sample number $n$ and the signal-to-noise ratio ($\mathrm{SNR}$) for the correct recovery of the permutation matrix $\mathbf{\Pi}^{\mathrm{true}}$ and the support set $\mathrm{supp}(\boldsymbol{\beta}^{\mathrm{true}})$; more specifically, $n \gtrsim k\log p$ and $\log\mathrm{SNR} \gtrsim \log n + \frac{k\log p}{n}$.
1 code implementation • 15 Mar 2023 • Hang Zhang, Rongguang Wang, Renjiu Hu, Jinwei Zhang, Jiahao Li
Chronic active multiple sclerosis lesions, also termed rim+ lesions, can be characterized by a hyperintense rim at the edge of the lesion on quantitative susceptibility maps.
no code implementations • 19 Jan 2023 • Hang Zhang, Rongguang Wang, Jinwei Zhang, Dongdong Liu, Chao Li, Jiahao Li
Compared to natural images, medical images usually show stronger visual patterns, which adds flexibility and elasticity to resource-limited clinical applications when proper priors are injected into neural networks.
1 code implementation • 18 Dec 2022 • Xingwei He, Yeyun Gong, A-Long Jin, Hang Zhang, Anlei Dong, Jian Jiao, Siu Ming Yiu, Nan Duan
The dual-encoder has become the de facto architecture for dense retrieval.
2 code implementations • 14 Dec 2022 • Jiashuo Sun, Hang Zhang, Chen Lin, Xiangdong Su, Yeyun Gong, Jian Guo
For the retriever, we adopt a number-aware negative sampling strategy to enable the retriever to be more discriminative on key numerical facts.
Ranked #1 on Conversational Question Answering on ConvFinQA
no code implementations • 9 Dec 2022 • Manas Gupta, Sarthak Ketanbhai Modi, Hang Zhang, Joon Hei Lee, Joo Hwee Lim
Four of the five Bio-algorithms tested outperform BP by up to 5% in accuracy when only 20% of the training dataset is available.
1 code implementation • 1 Nov 2022 • Jinwei Zhang, Pascal Spincemaille, Hang Zhang, Thanh D. Nguyen, Chao Li, Jiahao Li, Ilhami Kovanlikaya, Mert R. Sabuncu, Yi Wang
In this paper, we present our new framework, called Learned Acquisition and Reconstruction Optimization (LARO), which aims to accelerate the multi-echo gradient echo (mGRE) pulse sequence for QSM.
no code implementations • 21 Oct 2022 • Xingwei He, Yeyun Gong, A-Long Jin, Weizhen Qi, Hang Zhang, Jian Jiao, Bartuer Zhou, Biao Cheng, SM Yiu, Nan Duan
Commonsense generation aims to generate a realistic sentence describing a daily scene under the given concepts, which is very challenging, since it requires models to have relational reasoning and compositional generalization capabilities.
1 code implementation • 18 Oct 2022 • Shuai Fan, Chen Lin, Haonan Li, Zhenghao Lin, Jinsong Su, Hang Zhang, Yeyun Gong, Jian Guo, Nan Duan
Most existing pre-trained language representation models (PLMs) are sub-optimal in sentiment analysis tasks, as they capture sentiment information at the word level while under-considering sentence-level information.
1 code implementation • CVPR 2023 • Feng Liang, Bichen Wu, Xiaoliang Dai, Kunpeng Li, Yinan Zhao, Hang Zhang, Peizhao Zhang, Peter Vajda, Diana Marculescu
To address this, we propose to finetune CLIP on a collection of masked image regions and their corresponding text descriptions.
Ranked #4 on Semantic Segmentation on Replica
1 code implementation • 27 Sep 2022 • Zhenghao Lin, Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, Anlei Dong, Jian Jiao, Jingwen Lu, Daxin Jiang, Rangan Majumder, Nan Duan
It is common that a better teacher model results in a bad student via distillation due to the non-negligible gap between teacher and student.
no code implementations • 9 Jul 2022 • Liren Yang, Hang Zhang, Jean-Baptiste Jeannin, Necmiye Ozay
This Minkowski difference needs to be represented as a constrained zonotope to enable subsequent computation, but, as we show, it is impossible to find a polynomial-sized representation for it in polynomial time.
no code implementations • 6 May 2022 • Hang Zhang, Afshin Abdi, Faramarz Fekri
For the first time, we show that the graphical structure can be correctly recovered under the indefinite sensing system ($d < p$) using insufficient samples ($n < p$).
no code implementations • 11 Apr 2022 • Hang Zhang, Afshin Abdi, Faramarz Fekri
This paper proposes a general framework to design a sparse sensing matrix $\mathbf{A}\in \mathbb{R}^{m\times n}$ in a linear measurement system $\mathbf{y} = \mathbf{Ax}^{\natural} + \mathbf{w}$, where $\mathbf{y} \in \mathbb{R}^m$, $\mathbf{x}^{\natural}\in \mathbb{R}^n$, and $\mathbf{w}$ denote the measurements, the signal with certain structures, and the measurement noise, respectively.
no code implementations • 17 Mar 2022 • Muralikrishnna G. Sethuraman, Hang Zhang, Faramarz Fekri
In this paper, we propose a general framework for designing sensing matrix $\boldsymbol{A} \in \mathbb{R}^{d\times p}$, for estimation of sparse covariance matrix from compressed measurements of the form $\boldsymbol{y} = \boldsymbol{A}\boldsymbol{x} + \boldsymbol{n}$, where $\boldsymbol{y}, \boldsymbol{n} \in \mathbb{R}^d$, and $\boldsymbol{x} \in \mathbb{R}^p$.
no code implementations • 8 Mar 2022 • Xiaoyan Qiu, Hang Zhang, Yiwei Qiu, Buxiang Zhou, Tianlei Zang, Ruomei Qi, Jin Lin, Jiepeng Wang
When directly coupled with fluctuating energy sources such as wind and photovoltaic power, the alkaline electrolysis (AEL) in a power-to-hydrogen (P2H) system is required to operate flexibly by dynamically adjusting its hydrogen production rate.
no code implementations • 16 Feb 2022 • Hang Zhang, Su Yang, Hongyong Wang, Zhongyan Lu, Helin Sun
Few studies have examined simultaneous detection of smoke and flame accompanying fires due to their different physical natures that lead to uncertain fluid patterns.
1 code implementation • 26 Jan 2022 • Xiaonan Li, Yeyun Gong, Yelong Shen, Xipeng Qiu, Hang Zhang, Bolun Yao, Weizhen Qi, Daxin Jiang, Weizhu Chen, Nan Duan
For bimodal contrastive learning, we leverage the documentation and in-line comments of code to build code-text pairs.
no code implementations • 19 Nov 2021 • Bichen Wu, Chaojian Li, Hang Zhang, Xiaoliang Dai, Peizhao Zhang, Matthew Yu, Jialiang Wang, Yingyan Lin, Peter Vajda
To tackle these challenges, we propose FBNetV5, a NAS framework that can search for neural architectures for a variety of vision tasks with much reduced computational cost and human effort.
Ranked #7 on Neural Architecture Search on ImageNet
1 code implementation • ICLR 2022 • Hang Zhang, Yeyun Gong, Yelong Shen, Jiancheng Lv, Nan Duan, Weizhu Chen
To address these challenges, we present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker.
no code implementations • 29 Sep 2021 • Chaojian Li, KyungMin Kim, Bichen Wu, Peizhao Zhang, Hang Zhang, Xiaoliang Dai, Peter Vajda, Yingyan Lin
In particular, when transferred to PiT, our scaling strategies boost ImageNet top-1 accuracy from $74.6\%$ to $76.7\%$ ($\uparrow2.1\%$) under the same 0.7G FLOPs; and when transferred to the COCO object detection task, the average precision is boosted by $\uparrow0.7\%$ under a similar throughput on a V100 GPU.
no code implementations • 10 May 2021 • Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen
We first evaluate Poolingformer on two long sequence QA tasks: the monolingual NQ and the multilingual TyDi QA.
1 code implementation • 6 May 2021 • Zizhen Zhang, Zhiyuan Wu, Hang Zhang, Jiahai Wang
When these problems are extended to multiobjective ones, it becomes difficult for the existing DRL approaches to flexibly and efficiently deal with multiple subproblems determined by weight decomposition of objectives.
no code implementations • 4 May 2021 • Chao Li, Hang Zhang, Jinwei Zhang, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang
An approach to reduce motion artifacts in Quantitative Susceptibility Mapping using deep learning is proposed.
no code implementations • 10 Mar 2021 • Jinwei Zhang, Hang Zhang, Chao Li, Pascal Spincemaille, Mert Sabuncu, Thanh D. Nguyen, Yi Wang
Quantitative imaging in MRI usually involves acquisition and reconstruction of a series of images at multi-echo time points, which possibly requires more scan time and specific reconstruction technique compared to conventional qualitative imaging.
1 code implementation • 6 Mar 2021 • Hang Zhang, Rongguang Wang, Jinwei Zhang, Chao Li, Gufeng Yang, Pascal Spincemaille, Thanh Nguyen, Yi Wang
We introduce the Neural Representation of Distribution (NeRD) technique, a module for convolutional neural networks (CNNs) that can estimate the feature distribution by optimizing an underlying function mapping image coordinates to the feature distribution.
1 code implementation • Findings (ACL) 2021 • Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu, Linjun Shou, Ming Gong, Pengcheng Wang, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Ruofei Zhang, Winnie Wu, Ming Zhou, Nan Duan
Multi-task benchmarks such as GLUE and SuperGLUE have driven great progress in pretraining and transfer learning in Natural Language Processing (NLP).
no code implementations • 16 Oct 2020 • Tianyu Ma, Hang Zhang, Hanley Ong, Amar Vora, Thanh D. Nguyen, Ajay Gupta, Yi Wang, Mert Sabuncu
Our core idea is straightforward: A diverse ensemble of low precision and high recall models are likely to make different false positive errors (classifying background as foreground in different parts of the image), but the true positives will tend to be consistent.
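The intuition above can be sketched with a simple voting rule over binary masks: pixels that every high-recall model marks as foreground survive, while false positives that appear in only some models are dropped. The unanimous-vote default and the `consensus_mask` name are assumptions for illustration; the snippet does not state the aggregation rule the authors actually use.

```python
def consensus_mask(masks, min_votes=None):
    """Combine binary masks (lists of rows of 0/1) by vote counting.

    A pixel is kept as foreground only if at least `min_votes` models
    mark it; by default every model must agree.
    """
    if min_votes is None:
        min_votes = len(masks)  # unanimous agreement (an assumed default)
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [
            1 if sum(mask[r][c] for mask in masks) >= min_votes else 0
            for c in range(width)
        ]
        for r in range(height)
    ]
```

Lowering `min_votes` trades precision back for recall, which may be preferable when individual models occasionally miss true foreground pixels.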
no code implementations • 29 Sep 2020 • Hang Zhang, Jinwei Zhang, Rongguang Wang, Qihao Zhang, Susan A. Gauthier, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang
Multiple sclerosis (MS) lesions occupy a small fraction of the brain volume, and are heterogeneous with regard to shape, size, and location, which poses a great challenge for training deep learning based segmentation models.
1 code implementation • 22 Sep 2020 • Jia Xue, Hang Zhang, Ko Nishino, Kristin J. Dana
A key concept is differential angular imaging, where small angular variations in image capture enable angular-gradient features for an enhanced appearance representation that improves recognition.
no code implementations • 13 Sep 2020 • Hang Zhang, Jinwei Zhang, Rongguang Wang, Qihao Zhang, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang
Recently, 3D medical image reconstruction (MIR) and segmentation (MIS) based on deep neural networks have been developed with promising results, and attention mechanisms have been further designed to capture global contextual information for performance enhancement.
no code implementations • 7 Sep 2020 • Jinwei Zhang, Hang Zhang, Mert Sabuncu, Pascal Spincemaille, Thanh Nguyen, Yi Wang
A learning-based posterior distribution estimation method, Probabilistic Dipole Inversion (PDI), is proposed to solve the quantitative susceptibility mapping (QSM) inverse problem in MRI with uncertainty estimation.
no code implementations • 28 Jul 2020 • Jinwei Zhang, Hang Zhang, Alan Wang, Qihao Zhang, Mert Sabuncu, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang
The previously established LOUPE (Learning-based Optimization of the Under-sampling Pattern) framework for optimizing the k-space sampling pattern in MRI was extended in three ways: first, fully sampled multi-coil k-space data from the scanner, rather than simulated k-space data from magnitude MR images as in LOUPE, was retrospectively under-sampled to optimize the under-sampling pattern of in-vivo k-space data; second, binary stochastic k-space sampling, rather than the approximate stochastic k-space sampling used by LOUPE during training, was applied together with a straight-through (ST) estimator to estimate the gradient of the threshold operation in a neural network; third, a modified unrolled optimization network, rather than the modified U-Net in LOUPE, was used as the reconstruction network in order to reconstruct multi-coil data properly and reduce the dependency on training data.
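The straight-through estimator mentioned above can be illustrated without an autodiff library. In the forward pass a hard threshold produces a binary sampling mask; in the backward pass the threshold is treated as if it were the identity, so the upstream gradient reaches the sampling probabilities unchanged. The identity variant shown here, and all function names, are assumptions for illustration; other ST variants scale the gradient, and the source does not say which one is used.

```python
import random


def binarize_forward(probabilities, thresholds):
    """Hard threshold: sample location i if probabilities[i] > thresholds[i]."""
    return [1.0 if p > t else 0.0 for p, t in zip(probabilities, thresholds)]


def binarize_backward_straight_through(upstream_gradient):
    """Straight-through backward pass (identity variant).

    The true derivative of the threshold is zero almost everywhere, so
    the estimator passes the gradient with respect to the mask through
    as the gradient with respect to the probabilities.
    """
    return list(upstream_gradient)


def sample_binary_mask(probabilities, seed=0):
    """Binary stochastic sampling with uniform random thresholds."""
    rng = random.Random(seed)
    thresholds = [rng.random() for _ in probabilities]
    return binarize_forward(probabilities, thresholds)
```

In a training loop, the gradient returned by `binarize_backward_straight_through` would be used to update the sampling probabilities even though the forward pass only ever sees a 0/1 mask.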
no code implementations • 9 May 2020 • Shijie Geng, Ji Zhang, Zuohui Fu, Peng Gao, Hang Zhang, Gerard de Melo
Without identifying the connection between appearing people and character names, a model is not able to obtain a genuine understanding of the plots.
no code implementations • 30 Apr 2020 • Yi Zhu, Zhongyue Zhang, Chongruo Wu, Zhi Zhang, Tong He, Hang Zhang, R. Manmatha, Mu Li, Alexander Smola
In the case of semantic segmentation, this means that large amounts of pixelwise annotations are required to learn accurate models.
no code implementations • ACL 2020 • Hang Zhang, Dayiheng Liu, Jiancheng Lv, Cheng Luo
To our knowledge, this is the first attempt to generate punchlines with a knowledge-enhanced model.
35 code implementations • 19 Apr 2020 • Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Haibin Lin, Zhi Zhang, Yue Sun, Tong He, Jonas Mueller, R. Manmatha, Mu Li, Alexander Smola
It is well known that featuremap attention and multi-path representation are important for visual recognition.
Ranked #10 on Instance Segmentation on COCO test-dev (APM metric)
1 code implementation • 5 Apr 2020 • Tongxin Hu, Vasileios Iosifidis, Wentong Liao, Hang Zhang, Michael YingYang, Eirini Ntoutsi, Bodo Rosenhahn
In this paper, we propose FairNN, a neural network that performs joint feature representation and classification for fairness-aware learning.
7 code implementations • 13 Mar 2020 • Nick Erickson, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, Alexander Smola
We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models on an unprocessed tabular dataset such as a CSV file.
Ranked #8 on Molecular Property Prediction on Tox21
1 code implementation • 27 Feb 2020 • Hang Zhang, Jinwei Zhang, Qihao Zhang, Jeremy Kim, Shun Zhang, Susan A. Gauthier, Pascal Spincemaille, Thanh D. Nguyen, Mert R. Sabuncu, Yi Wang
Brain lesion volume measured on T2 weighted MRI images is a clinically important disease marker in multiple sclerosis (MS).
no code implementations • MIDL 2019 • Zhuo Kuang, Xianbo Deng, Li Yu, Hang Zhang, Xian Lin, Hui Ma
Guided by the morphological features of the skull, a skeleton-based region proposal method is proposed to make candidate boxes more concentrated in key regions and reduce invalid boxes.
no code implementations • 5 Sep 2019 • Hang Zhang, Martin Slawski, Ping Li
For the case in which both the signal and permutation are unknown, the problem is reformulated as a bi-convex optimization problem with an auxiliary variable, which can be solved by the Alternating Direction Method of Multipliers (ADMM).
no code implementations • 18 Jul 2019 • Tingting Zhao, Hang Zhang, Jacob Spoelstra
We worked with Nestle SHIELD (Skin Health, Innovation, Education, and Longevity Development, NSH) to develop a deep learning model that is able to assess acne severity from selfie images as accurately as dermatologists.
no code implementations • 16 Jul 2019 • Shijie Geng, Ji Zhang, Hang Zhang, Ahmed Elgammal, Dimitris N. Metaxas
We present a simple method that achieves unexpectedly superior performance for Complex Reasoning involved Visual Question Answering.
3 code implementations • 9 Jul 2019 • Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu
We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).
2 code implementations • CVPR 2019 • Hang Zhang, Han Zhang, Chenguang Wang, Junyuan Xie
To leverage the semantic context in the co-occurrent features, we build an Aggregated Co-occurrent Feature (ACF) Module by aggregating the probability of the co-occurrent feature with the co-occurrent context.
Ranked #34 on Semantic Segmentation on PASCAL Context
2 code implementations • 26 Apr 2019 • Haibin Lin, Hang Zhang, Yifei Ma, Tong He, Zhi Zhang, Sheng Zha, Mu Li
One difficulty we observe is that the noise in the stochastic momentum estimation is accumulated over time and will have delayed effects when the batch size changes.
3 code implementations • 11 Feb 2019 • Zhi Zhang, Tong He, Hang Zhang, Zhongyue Zhang, Junyuan Xie, Mu Li
Training heuristics greatly improve various image classification model accuracies (He et al., 2018).
2 code implementations • 4 Feb 2019 • Junhao Li, Hang Zhang
We present Blaze, a C++ library that makes it easy to develop high performance parallel programs for such compute intensive tasks.
27 code implementations • CVPR 2019 • Tong He, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, Mu Li
Much of the recent progress made in image classification research can be credited to training procedure refinements, such as changes in data augmentations and optimization methods.
Ranked #38 on Domain Generalization on VizWiz-Classification
1 code implementation • CVPR 2018 • Jia Xue, Hang Zhang, Kristin Dana
The GTOS database (comprising over 30,000 images of 40 classes of ground terrain in outdoor scenes) enables supervised recognition.
12 code implementations • CVPR 2018 • Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal
In this paper, we explore the impact of global contextual information in semantic segmentation by introducing the Context Encoding Module, which captures the semantic context of scenes and selectively highlights class-dependent featuremaps.
Ranked #7 on Semantic Segmentation on PASCAL VOC 2012 test
no code implementations • 14 Jun 2017 • Parneet Kaur, Hang Zhang, Kristin J. Dana
We address the challenging problem of transferring face texture from a style face image to a content face image in a photorealistic manner without changing the identity of the original content image.
6 code implementations • 20 Mar 2017 • Hang Zhang, Kristin Dana
Despite the rapid progress in style transfer, existing approaches that use feed-forward generative networks for multi-style or arbitrary-style transfer usually compromise image quality and model flexibility.
12 code implementations • CVPR 2017 • Hang Zhang, Jia Xue, Kristin Dana
The representation is orderless and therefore is particularly useful for material and texture recognition.
no code implementations • CVPR 2017 • Jia Xue, Hang Zhang, Kristin Dana, Ko Nishino
We realize this by developing a framework for differential angular imaging, where small angular variations in image capture provide an enhanced appearance representation and significant recognition improvement.
no code implementations • 15 Nov 2016 • Hang Zhang, Fengyuan Zhu, Shixin Li
However, in real-world applications, it is common to see the training data contaminated by noise, which can affect the robustness of these matrix regression methods.
no code implementations • 25 Mar 2016 • Hang Zhang, Kristin Dana, Ko Nishino
In this work, we address the question of what reflectance can reveal about materials in an efficient manner.
no code implementations • CVPR 2015 • Hang Zhang, Kristin Dana, Ko Nishino
Reflectance offers a unique signature of the material but is challenging to measure and use for recognizing materials due to its high dimensionality.