Search Results for author: Hang Zhang

Found 108 papers, 46 papers with code

Optimal Estimator for Unlabeled Linear Regression

no code implementations ICML 2020 Hang Zhang, Ping Li

Unlabeled linear regression, or "linear regression with an unknown permutation", has attracted increasing attention due to its applications in record linkage and de-anonymization.

regression

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

1 code implementation 1 Jan 2025 Wenqi Zhang, Hang Zhang, Xin Li, Jiashuo Sun, Yongliang Shen, Weiming Lu, Deli Zhao, Yueting Zhuang, Lidong Bing

Compared to its counterparts, our video-centric textbook offers more coherent context, richer knowledge, and better image-text alignment.

Optical Character Recognition (OCR)

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

1 code implementation 31 Dec 2024 Yuqian Yuan, Hang Zhang, Wentong Li, Zesen Cheng, Boqiang Zhang, Long Li, Xin Li, Deli Zhao, Wenqiao Zhang, Yueting Zhuang, Jianke Zhu, Lidong Bing

Finally, we meticulously create a VideoRefer-Bench to comprehensively assess the spatial-temporal understanding capability of a Video LLM, evaluating it across various aspects.

Object Video Understanding

AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility

no code implementations 30 Dec 2024 Yixian Shen, Hang Zhang, Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du, Yiyi Tao

Digital accessibility is a cornerstone of inclusive content delivery, yet many EPUB files fail to meet fundamental accessibility standards, particularly in providing descriptive alt text for images.

Descriptive Text Generation

Comparative Analysis of Listwise Reranking with Large Language Models in Limited-Resource Language Contexts

no code implementations 28 Dec 2024 Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du, Yiyi Tao, Yixian Shen, Hang Zhang

Large Language Models (LLMs) have demonstrated significant effectiveness across various NLP tasks, including text ranking.

Robustness of Large Language Models Against Adversarial Attacks

no code implementations 22 Dec 2024 Yiyi Tao, Yixian Shen, Hang Zhang, Yanxin Shen, Lun Wang, Chuanqi Shi, Shaoshuai Du

The increasing deployment of Large Language Models (LLMs) in various applications necessitates a rigorous evaluation of their robustness against adversarial attacks.

Sentiment Analysis Sentiment Classification +1

SceneLLM: Implicit Language Reasoning in LLM for Dynamic Scene Graph Generation

no code implementations 15 Dec 2024 Hang Zhang, Zhuoling Li, Jun Liu

Dynamic scenes contain intricate spatio-temporal information, crucial for mobile robots, UAVs, and autonomous driving systems to make informed decisions.

Autonomous Driving Graph Generation +1

Splats in Splats: Embedding Invisible 3D Watermark within Gaussian Splatting

no code implementations 4 Dec 2024 Yijia Guo, Wenkai Huang, Yang Li, Gaolei Li, Hang Zhang, Liwen Hu, Jianhua Li, Tiejun Huang, Lei Ma

3D Gaussian splatting (3DGS) has demonstrated impressive 3D reconstruction performance with explicit scene representations.

3DGS 3D Reconstruction

CLAP: Unsupervised 3D Representation Learning for Fusion 3D Perception via Curvature Sampling and Prototype Learning

no code implementations 4 Dec 2024 Runjian Chen, Hang Zhang, Avinash Ravichandran, Wenqi Shao, Alex Wong, Ping Luo

In this paper, we explore joint unsupervised pre-training for fusion 3D perception via differentiable rendering and propose CLAP, short for Curvature sampLing and swApping Prototype assignment prediction.

Representation Learning Unsupervised Pre-training

Fidelity-Imposed Displacement Editing for the Learn2Reg 2024 SHG-BF Challenge

no code implementations 28 Oct 2024 Jiacheng Wang, Xiang Chen, Renjiu Hu, Rongguang Wang, Min Liu, Yaonan Wang, Jiazheng Wang, Hao Li, Hang Zhang

Co-examination of second-harmonic generation (SHG) and bright-field (BF) microscopy enables the differentiation of tissue components and collagen fibers, aiding the analysis of human breast and pancreatic cancer tissues.

Contrastive Learning

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

1 code implementation 22 Oct 2024 Zesen Cheng, Hang Zhang, Kehan Li, Sicong Leng, Zhiqiang Hu, Fei Wu, Deli Zhao, Xin Li, Lidong Bing

Contrastive loss is a powerful approach for representation learning, where larger batch sizes enhance performance by providing more negative samples to better distinguish between similar and dissimilar data.

Representation Learning

The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

1 code implementation 16 Oct 2024 Sicong Leng, Yun Xing, Zesen Cheng, Yang Zhou, Hang Zhang, Xin Li, Deli Zhao, Shijian Lu, Chunyan Miao, Lidong Bing

Recent advancements in large multimodal models (LMMs) have significantly enhanced performance across diverse tasks, with ongoing efforts to further integrate additional modalities such as video and audio.

Hallucination

Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective

1 code implementation 16 Oct 2024 Yongxin Zhu, Bocheng Li, Hang Zhang, Xin Li, Linli Xu, Lidong Bing

Furthermore, we propose a simple but effective discrete image tokenizer to stabilize the latent space for image generative modeling by applying K-Means on the latent features of self-supervised learning models.

Conditional Image Generation Linear-Probe Classification +3

GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation

1 code implementation 15 Oct 2024 Fei Tang, Yongliang Shen, Hang Zhang, Zeqi Tan, Wenqi Zhang, Guiyang Hou, Kaitao Song, Weiming Lu, Yueting Zhuang

GaVaMoE introduces two key components: (1) a rating reconstruction module that employs Variational Autoencoder (VAE) with a Gaussian Mixture Model (GMM) to capture complex user-item collaborative preferences, serving as a pre-trained multi-gating mechanism; and (2) a set of fine-grained expert models coupled with the multi-gating mechanism for generating highly personalized explanations.

Explainable Recommendation Language Modelling +1

Mitigating the Risk of Health Inequity Exacerbated by Large Language Models

no code implementations 7 Oct 2024 Yuelyu Ji, Wenhe Ma, Sonish Sivarajkumar, Hang Zhang, Eugene Mathew Sadhu, Zhuochun Li, Xizhi Wu, Shyam Visweswaran, Yanshan Wang

Recent advancements in large language models have demonstrated their potential in numerous medical applications, particularly in automating clinical trial matching for translational research and enhancing medical question answering for clinical decision support.

Bias Detection Question Answering

NEVLP: Noise-Robust Framework for Efficient Vision-Language Pre-training

no code implementations 15 Sep 2024 Yiyi Tao, Zhuoyue Wang, Hang Zhang, Lun Wang

In noise-adaptive learning, we estimate the noise probability of each image-text pair based on the transformer's memorization effect and employ noise-adaptive regularization on image-text contrastive learning to condition cross-modal alignment.

Contrastive Learning cross-modal alignment +11

Distributed Clustering based on Distributional Kernel

no code implementations 14 Sep 2024 Hang Zhang, Yang Xu, Lei Gong, Ye Zhu, Kai Ming Ting

This paper introduces a new framework for clustering in a distributed network called Distributed Clustering based on Distributional Kernel (K) or KDC that produces the final clusters based on the similarity with respect to the distributions of initial clusters, as measured by K. It is the only framework that satisfies all three of the following properties.

Clustering

LLM-based Weak Supervision Framework for Query Intent Classification in Video Search

no code implementations 13 Sep 2024 Farnoosh Javadi, Phanideep Gampa, Alyssa Woo, Xingxing Geng, Hang Zhang, Jose Sepulveda, Belhassen Bayar, Fei Wang

Furthermore, our novel prompt engineering framework yields higher-quality LLM-generated data to be used for weak supervision; we observed a 47.60% improvement over the baseline in agreement rate between LLM predictions and human annotations with respect to F1 score, weighted according to the distribution of occurrences of the search queries.

In-Context Learning intent-classification +3

Unsupervised Multimodal 3D Medical Image Registration with Multilevel Correlation Balanced Optimization

1 code implementation 8 Sep 2024 Jiazheng Wang, Xiang Chen, Yuxi Zhang, Min Liu, Yaonan Wang, Hang Zhang

However, due to the differences between multimodal images and intraoperative image deformation caused by tissue displacement and removal during surgery, effective registration of preoperative and intraoperative multimodal images faces significant challenges.

Image Registration Medical Image Registration

Large Scale Unsupervised Brain MRI Image Registration Solution for Learn2Reg 2024

no code implementations 2 Sep 2024 Yuxi Zhang, Xiang Chen, Jiazheng Wang, Min Liu, Yaonan Wang, Dongdong Liu, Renjiu Hu, Hang Zhang

In this paper, we summarize the methods we proposed and the experimental results we obtained for Task 2 in the Learn2Reg 2024 Challenge.

Image Registration Task 2

OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces

no code implementations 16 Jul 2024 Zehan Wang, Ziang Zhang, Hang Zhang, Luping Liu, Rongjie Huang, Xize Cheng, Hengshuang Zhao, Zhou Zhao

Given the foundational role of multimodal joint representation in understanding and generation pipelines, high-quality omni joint representations would be a step toward co-processing more diverse multimodal information.

MemWarp: Discontinuity-Preserving Cardiac Registration with Memorized Anatomical Filters

1 code implementation 10 Jul 2024 Hang Zhang, Xiang Chen, Renjiu Hu, Dongdong Liu, Gaolei Li, Rongguang Wang

In this paper, we address this issue with MemWarp, a learning framework that leverages a memory network to store prototypical information tailored to different anatomical regions.

Image Registration

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

4 code implementations 11 Jun 2024 Zesen Cheng, Sicong Leng, Hang Zhang, Yifei Xin, Xin Li, Guanzheng Chen, Yongxin Zhu, Wenqi Zhang, Ziyang Luo, Deli Zhao, Lidong Bing

In this paper, we present VideoLLaMA 2, a set of Video Large Language Models (Video-LLMs) designed to enhance spatial-temporal modeling and audio understanding in video and audio-oriented tasks.

Multiple-choice Temporal Relation Extraction +3

Automatic Knowledge Graph Construction for Judicial Cases

no code implementations 15 Apr 2024 Jie Zhou, Xin Chen, Hang Zhang, Zhe Li

Building on these results, we detail the automatic construction process of case knowledge graphs for judicial cases, enabling the assembly of knowledge graphs for hundreds of thousands of judgments.

graph construction Knowledge Graphs

Global-guided Focal Neural Radiance Field for Large-scale Scene Rendering

no code implementations 19 Mar 2024 Mingqi Shao, Feng Xiong, Hang Zhang, Shuang Yang, Mu Xu, Wei Bian, Xueqian Wang

The global stage obtains a continuous representation of the entire scene while the focal stage decomposes the scene into multiple blocks and further processes them with distinct sub-encoders.

NeRF

Anomaly Detection Based on Isolation Mechanisms: A Survey

no code implementations 16 Mar 2024 Yang Cao, Haolong Xiang, Hang Zhang, Ye Zhu, Kai Ming Ting

Anomaly detection is a longstanding and active research area that has many applications in domains such as finance, security, and manufacturing.

Survey Unsupervised Anomaly Detection

Enhancing Human-Centered Dynamic Scene Understanding via Multiple LLMs Collaborated Reasoning

no code implementations 15 Mar 2024 Hang Zhang, Wenxiao Zhang, Haoxuan Qu, Jun Liu

Human-centered dynamic scene understanding plays a pivotal role in enhancing the capability of robotic and autonomous systems, in which Video-based Human-Object Interaction (V-HOI) detection is a crucial task in semantic scene understanding, aimed at comprehensively understanding HOI relationships within a video to benefit the behavioral decisions of mobile robots and autonomous driving systems.

Autonomous Driving Human-Object Interaction Detection +2

SVIPTR: Fast and Efficient Scene Text Recognition with Vision Permutable Extractor

1 code implementation 18 Jan 2024 Xianfu Cheng, Weixiao Zhou, Xiang Li, Jian Yang, Hang Zhang, Tao Sun, Wei Zhang, Yuying Mai, Tongliang Li, Xiaoming Chen, Zhoujun Li

In this work, we propose a VIsion Permutable extractor for fast and efficient Scene Text Recognition (SVIPTR), which achieves an impressive balance between high performance and rapid inference speeds in the domain of STR.

Decoder Scene Text Recognition

Slicer Networks

no code implementations 18 Jan 2024 Hang Zhang, Xiang Chen, Rongguang Wang, Renjiu Hu, Dongdong Liu, Gaolei Li

In medical imaging, scans often reveal objects with varied contrasts but consistent internal intensities or textures.

Image Registration Lesion Segmentation +2

MaskPLAN: Masked Generative Layout Planning from Partial Input

no code implementations CVPR 2024 Hang Zhang, Anton Savov, Benjamin Dillenburger

Layout planning, spanning from architecture to interior design, is a slow iterative exploration of ill-defined problems, adopting an "I'll know it when I see it" approach to potential solutions.

Attribute Design Synthesis

SeaLLMs -- Large Language Models for Southeast Asia

1 code implementation 1 Dec 2023 Xuan-Phi Nguyen, Wenxuan Zhang, Xin Li, Mahani Aljunied, Zhiqiang Hu, Chenhui Shen, Yew Ken Chia, Xingxuan Li, Jianyu Wang, Qingyu Tan, Liying Cheng, Guanzheng Chen, Yue Deng, Sen Yang, Chaoqun Liu, Hang Zhang, Lidong Bing

Despite the remarkable achievements of large language models (LLMs) in various tasks, there remains a linguistic bias that favors high-resource languages, such as English, often at the expense of low-resource and regional languages.

Instruction Following

Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding

6 code implementations CVPR 2024 Sicong Leng, Hang Zhang, Guanzheng Chen, Xin Li, Shijian Lu, Chunyan Miao, Lidong Bing

Large Vision-Language Models (LVLMs) have advanced considerably, intertwining visual recognition and language understanding to generate content that is not only coherent but also contextually attuned.

Hallucination Object +1

Spatially Covariant Image Registration with Text Prompts

1 code implementation 27 Nov 2023 Xiang Chen, Min Liu, Rongguang Wang, Renjiu Hu, Dongdong Liu, Gaolei Li, Hang Zhang

Medical images are often characterized by their structured anatomical representations and spatially inhomogeneous contrasts.

Ranked #2 on Image Registration on Unpaired-abdomen-CT (using extra training data)

Computational Efficiency Image Registration +2

Noisy Pair Corrector for Dense Retrieval

no code implementations 7 Nov 2023 Hang Zhang, Yeyun Gong, Xingwei He, Dayiheng Liu, Daya Guo, Jiancheng Lv, Jian Guo

Most dense retrieval models contain an implicit assumption: the training query-document pairs are exactly matched.

Code Search Text Retrieval +1

M&M3D: Multi-Dataset Training and Efficient Network for Multi-view 3D Object Detection

1 code implementation 2 Nov 2023 Hang Zhang

In this research, I proposed a network structure for multi-view 3D object detection using camera-only data and a Bird's-Eye-View map.

3D Object Detection Domain Adaptation +3

The Phase Transition Phenomenon of Shuffled Regression

no code implementations 31 Oct 2023 Hang Zhang, Ping Li

By linking this equation to the branching random walk process, we are able to characterize the impact of the signal-to-noise ratio ($\mathsf{SNR}$) on the permutation recovery.

regression

Optimal Estimator for Linear Regression with Shuffled Labels

no code implementations 2 Oct 2023 Hang Zhang, Ping Li

This paper considers the task of linear regression with shuffled labels, i.e., $\mathbf{Y} = \mathbf{\Pi} \mathbf{X} \mathbf{B} + \mathbf{W}$, where $\mathbf{Y} \in \mathbb{R}^{n\times m}$, $\mathbf{\Pi} \in \mathbb{R}^{n\times n}$, $\mathbf{X} \in \mathbb{R}^{n\times p}$, $\mathbf{B} \in \mathbb{R}^{p\times m}$, and $\mathbf{W} \in \mathbb{R}^{n\times m}$, respectively, represent the sensing results, (unknown or missing) corresponding information, sensing matrix, signal of interest, and additive sensing noise.

regression

DAGrid: Directed Accumulator Grid

1 code implementation 5 Jun 2023 Hang Zhang, Renjiu Hu, Xiang Chen, Rongguang Wang, Jinwei Zhang, Jiahao Li

Specifically, the network incorporating DAGrid has realized a 70.8% reduction in network parameter size and a 96.8% decrease in FLOPs, while concurrently improving the Dice score for skin lesion segmentation by 1.0% compared to state-of-the-art transformers.

Image Registration Lesion Segmentation +1

Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

4 code implementations 5 Jun 2023 Hang Zhang, Xin Li, Lidong Bing

We present Video-LLaMA, a multi-modal framework that empowers Large Language Models (LLMs) with the capability of understanding both visual and auditory content in the video.

Ranked #1 on Video-Text Retrieval on Test-of-Time (using extra training data)

Language Modeling Language Modelling +9

Physics-based network fine-tuning for robust quantitative susceptibility mapping from high-pass filtered phase

no code implementations 5 May 2023 Jinwei Zhang, Alexey Dimov, Chao Li, Hang Zhang, Thanh D. Nguyen, Pascal Spincemaille, Yi Wang

Purpose: To improve the generalization ability of convolutional neural network (CNN) based prediction of quantitative susceptibility mapping (QSM) from high-pass filtered phase (HPFP) image.

SSIM

mcLARO: Multi-Contrast Learned Acquisition and Reconstruction Optimization for simultaneous quantitative multi-parametric mapping

no code implementations 7 Apr 2023 Jinwei Zhang, Thanh D. Nguyen, Eddy Solomon, Chao Li, Qihao Zhang, Jiahao Li, Hang Zhang, Pascal Spincemaille, Yi Wang

Results: The retrospective ablation study showed improved image sharpness of mcLARO compared to the baseline network without multi-contrast sampling pattern optimization or image feature fusion, and negligible bias and narrow 95% limits of agreement on regional T1, T2, T2* and QSM values were obtained by the under-sampled reconstructions compared to the fully sampled reconstruction.

Image Reconstruction

AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators

2 code implementations 29 Mar 2023 Xingwei He, Zhenghao Lin, Yeyun Gong, A-Long Jin, Hang Zhang, Chen Lin, Jian Jiao, Siu Ming Yiu, Nan Duan, Weizhu Chen

Many natural language processing (NLP) tasks rely on labeled data to train machine learning models with high performance.

Information Retrieval Retrieval

Sparse Recovery with Shuffled Labels: Statistical Limits and Practical Estimators

no code implementations 20 Mar 2023 Hang Zhang, Ping Li

From the statistical aspect, we first establish the minimax lower bounds on the sample number $n$ and the \emph{signal-to-noise ratio} ($\mathsf{SNR}$) for the correct recovery of the permutation matrix $\mathbf{\Pi}^{\mathrm{true}}$ and the support set $\mathrm{supp}(\boldsymbol{\beta}^{\mathrm{true}})$; more specifically, $n \gtrsim k\log p$ and $\log\mathsf{SNR} \gtrsim \log n + \frac{k\log p}{n}$.

DeDA: Deep Directed Accumulator

1 code implementation 15 Mar 2023 Hang Zhang, Rongguang Wang, Renjiu Hu, Jinwei Zhang, Jiahao Li

Chronic active multiple sclerosis lesions, also termed as rim+ lesions, can be characterized by a hyperintense rim at the edge of the lesion on quantitative susceptibility maps.

Spatially Covariant Lesion Segmentation

no code implementations 19 Jan 2023 Hang Zhang, Rongguang Wang, Jinwei Zhang, Dongdong Liu, Chao Li, Jiahao Li

Compared to natural images, medical images usually show stronger visual patterns and therefore this adds flexibility and elasticity to resource-limited clinical applications by injecting proper priors into neural networks.

Computational Efficiency Lesion Segmentation +2

APOLLO: An Optimized Training Approach for Long-form Numerical Reasoning

2 code implementations 14 Dec 2022 Jiashuo Sun, Hang Zhang, Chen Lin, Xiangdong Su, Yeyun Gong, Jian Guo

For the retriever, we adopt a number-aware negative sampling strategy to enable the retriever to be more discriminative on key numerical facts.

Conversational Question Answering Diversity +1

Is Bio-Inspired Learning Better than Backprop? Benchmarking Bio Learning vs. Backprop

no code implementations 9 Dec 2022 Manas Gupta, Sarthak Ketanbhai Modi, Hang Zhang, Joon Hei Lee, Joo Hwee Lim

Four of the five Bio-algorithms tested outperform BP by up to 5% accuracy when only 20% of the training dataset is available.

Benchmarking

LARO: Learned Acquisition and Reconstruction Optimization to accelerate Quantitative Susceptibility Mapping

1 code implementation 1 Nov 2022 Jinwei Zhang, Pascal Spincemaille, Hang Zhang, Thanh D. Nguyen, Chao Li, Jiahao Li, Ilhami Kovanlikaya, Mert R. Sabuncu, Yi Wang

In this paper, we present our new framework, called Learned Acquisition and Reconstruction Optimization (LARO), which aims to accelerate the multi-echo gradient echo (mGRE) pulse sequence for QSM.

Metric-guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning

no code implementations 21 Oct 2022 Xingwei He, Yeyun Gong, A-Long Jin, Weizhen Qi, Hang Zhang, Jian Jiao, Bartuer Zhou, Biao Cheng, SM Yiu, Nan Duan

Commonsense generation aims to generate a realistic sentence describing a daily scene under the given concepts, which is very challenging, since it requires models to have relational reasoning and compositional generalization capabilities.

Relational Reasoning Re-Ranking +1

Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis

1 code implementation 18 Oct 2022 Shuai Fan, Chen Lin, Haonan Li, Zhenghao Lin, Jinsong Su, Hang Zhang, Yeyun Gong, Jian Guo, Nan Duan

Most existing pre-trained language representation models (PLMs) are sub-optimal in sentiment analysis tasks, as they capture the sentiment information from word-level while under-considering sentence-level information.

Contrastive Learning Language Modeling +4

PROD: Progressive Distillation for Dense Retrieval

1 code implementation 27 Sep 2022 Zhenghao Lin, Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, Anlei Dong, Jian Jiao, Jingwen Lu, Daxin Jiang, Rangan Majumder, Nan Duan

It is common that a better teacher model results in a bad student via distillation due to the nonnegligible gap between teacher and student.

Knowledge Distillation Natural Questions +1

Efficient Backward Reachability Using the Minkowski Difference of Constrained Zonotopes

no code implementations 9 Jul 2022 Liren Yang, Hang Zhang, Jean-Baptiste Jeannin, Necmiye Ozay

This Minkowski difference needs to be represented as a constrained zonotope to enable subsequent computation, but, as we show, it is impossible to find a polynomial-sized representation for it in polynomial time.

Structure Learning in Graphical Models from Indirect Observations

no code implementations 6 May 2022 Hang Zhang, Afshin Abdi, Faramarz Fekri

For the first time, we show that the graphical structure can be correctly recovered under the indefinite sensing system ($d < p$) using insufficient samples ($n < p$).

A General Compressive Sensing Construct using Density Evolution

no code implementations 11 Apr 2022 Hang Zhang, Afshin Abdi, Faramarz Fekri

This paper proposes a general framework to design a sparse sensing matrix $\mathbf{A}\in \mathbb{R}^{m\times n}$, in a linear measurement system $\mathbf{y} = \mathbf{A}\mathbf{x}^{\natural} + \mathbf{w}$, where $\mathbf{y} \in \mathbb{R}^m$, $\mathbf{x}^{\natural}\in \mathbb{R}^n$, and $\mathbf{w}$ denote the measurements, the signal with certain structures, and the measurement noise, respectively.

Compressive Sensing

A Density Evolution framework for Preferential Recovery of Covariance and Causal Graphs from Compressed Measurements

no code implementations 17 Mar 2022 Muralikrishnna G. Sethuraman, Hang Zhang, Faramarz Fekri

In this paper, we propose a general framework for designing sensing matrix $\boldsymbol{A} \in \mathbb{R}^{d\times p}$, for estimation of sparse covariance matrix from compressed measurements of the form $\boldsymbol{y} = \boldsymbol{A}\boldsymbol{x} + \boldsymbol{n}$, where $\boldsymbol{y}, \boldsymbol{n} \in \mathbb{R}^d$, and $\boldsymbol{x} \in \mathbb{R}^p$.

Retrieval

Online Dynamic Parameter Estimation of an Alkaline Electrolysis System Based on Bayesian Inference

no code implementations 8 Mar 2022 Xiaoyan Qiu, Hang Zhang, Yiwei Qiu, Buxiang Zhou, Tianlei Zang, Ruomei Qi, Jin Lin, Jiepeng Wang

When directly coupled with fluctuating energy sources such as wind and photovoltage power, the alkaline electrolysis (AEL) in a power-to-hydrogen (P2H) system is required to operate flexibly by dynamically adjusting its hydrogen production rate.

Bayesian Inference

Unified smoke and fire detection in an evolutionary framework with self-supervised progressive data augment

no code implementations 16 Feb 2022 Hang Zhang, Su Yang, Hongyong Wang, Zhongyan Lu, Helin Sun

Few studies have examined the simultaneous detection of smoke and flame accompanying fires, because their different physical natures lead to uncertain fluid patterns.

Fire Detection Multi-Label Image Classification +1

FBNetV5: Neural Architecture Search for Multiple Tasks in One Run

no code implementations 19 Nov 2021 Bichen Wu, Chaojian Li, Hang Zhang, Xiaoliang Dai, Peizhao Zhang, Matthew Yu, Jialiang Wang, Yingyan Lin, Peter Vajda

To tackle these challenges, we propose FBNetV5, a NAS framework that can search for neural architectures for a variety of vision tasks with much reduced computational cost and human effort.

Classification Image Classification +4

Adversarial Retriever-Ranker for dense text retrieval

1 code implementation ICLR 2022 Hang Zhang, Yeyun Gong, Yelong Shen, Jiancheng Lv, Nan Duan, Weizhu Chen

To address these challenges, we present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker.

Natural Questions Text Retrieval +1

An Investigation on Hardware-Aware Vision Transformer Scaling

no code implementations 29 Sep 2021 Chaojian Li, KyungMin Kim, Bichen Wu, Peizhao Zhang, Hang Zhang, Xiaoliang Dai, Peter Vajda, Yingyan Lin

In particular, when transferred to PiT, our scaling strategies boost ImageNet top-1 accuracy from $74.6\%$ to $76.7\%$ ($\uparrow2.1\%$) under the same 0.7G FLOPs; and when transferred to the COCO object detection task, the average precision is boosted by $\uparrow0.7\%$ under a similar throughput on a V100 GPU.

Image Classification object-detection +2

Poolingformer: Long Document Modeling with Pooling Attention

no code implementations 10 May 2021 Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen

We first evaluate Poolingformer on two long sequence QA tasks: the monolingual NQ and the multilingual TyDi QA.

Meta-Learning-Based Deep Reinforcement Learning for Multiobjective Optimization Problems

1 code implementation 6 May 2021 Zizhen Zhang, Zhiyuan Wu, Hang Zhang, Jiahai Wang

When these problems are extended to multiobjective ones, it becomes difficult for the existing DRL approaches to flexibly and efficiently deal with multiple subproblems determined by weight decomposition of objectives.

Combinatorial Optimization Deep Reinforcement Learning +5

Motion Artifact Reduction in Quantitative Susceptibility Mapping using Deep Neural Network

no code implementations 4 May 2021 Chao Li, Hang Zhang, Jinwei Zhang, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang

An approach to reduce motion artifacts in Quantitative Susceptibility Mapping using deep learning is proposed.

Temporal Feature Fusion with Sampling Pattern Optimization for Multi-echo Gradient Echo Acquisition and Image Reconstruction

no code implementations 10 Mar 2021 Jinwei Zhang, Hang Zhang, Chao Li, Pascal Spincemaille, Mert Sabuncu, Thanh D. Nguyen, Yi Wang

Quantitative imaging in MRI usually involves acquisition and reconstruction of a series of images at multi-echo time points, which possibly requires more scan time and specific reconstruction technique compared to conventional qualitative imaging.

Image Reconstruction

NeRD: Neural Representation of Distribution for Medical Image Segmentation

1 code implementation 6 Mar 2021 Hang Zhang, Rongguang Wang, Jinwei Zhang, Chao Li, Gufeng Yang, Pascal Spincemaille, Thanh Nguyen, Yi Wang

We introduce Neural Representation of Distribution (NeRD) technique, a module for convolutional neural networks (CNNs) that can estimate the feature distribution by optimizing an underlying function mapping image coordinates to the feature distribution.

Image Segmentation Lesion Segmentation +2

Ensembling Low Precision Models for Binary Biomedical Image Segmentation

no code implementations 16 Oct 2020 Tianyu Ma, Hang Zhang, Hanley Ong, Amar Vora, Thanh D. Nguyen, Ajay Gupta, Yi Wang, Mert Sabuncu

Our core idea is straightforward: A diverse ensemble of low precision and high recall models are likely to make different false positive errors (classifying background as foreground in different parts of the image), but the true positives will tend to be consistent.

Image Segmentation Lesion Segmentation +3

Geometric Loss for Deep Multiple Sclerosis lesion Segmentation

no code implementations 29 Sep 2020 Hang Zhang, Jinwei Zhang, Rongguang Wang, Qihao Zhang, Susan A. Gauthier, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang

Multiple sclerosis (MS) lesions occupy a small fraction of the brain volume and are heterogeneous with regard to shape, size, and location, which poses a great challenge for training deep-learning-based segmentation models.

Lesion Segmentation Segmentation

Differential Viewpoints for Ground Terrain Material Recognition

1 code implementation 22 Sep 2020 Jia Xue, Hang Zhang, Ko Nishino, Kristin J. Dana

A key concept is differential angular imaging, where small angular variations in image capture enable angular-gradient features for an enhanced appearance representation that improves recognition.

Autonomous Driving Material Recognition +1

Efficient Folded Attention for 3D Medical Image Reconstruction and Segmentation

no code implementations 13 Sep 2020 Hang Zhang, Jinwei Zhang, Rongguang Wang, Qihao Zhang, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang

Recently, 3D medical image reconstruction (MIR) and segmentation (MIS) based on deep neural networks have been developed with promising results, and attention mechanisms have been further designed to capture global contextual information for performance enhancement.

Computational Efficiency Image Reconstruction +1

Probabilistic Dipole Inversion for Adaptive Quantitative Susceptibility Mapping

no code implementations 7 Sep 2020 Jinwei Zhang, Hang Zhang, Mert Sabuncu, Pascal Spincemaille, Thanh Nguyen, Yi Wang

A learning-based posterior distribution estimation method, Probabilistic Dipole Inversion (PDI), is proposed to solve the quantitative susceptibility mapping (QSM) inverse problem in MRI with uncertainty estimation.

Density Estimation

Extending LOUPE for K-space Under-sampling Pattern Optimization in Multi-coil MRI

no code implementations 28 Jul 2020 Jinwei Zhang, Hang Zhang, Alan Wang, Qihao Zhang, Mert Sabuncu, Pascal Spincemaille, Thanh D. Nguyen, Yi Wang

The previously established LOUPE (Learning-based Optimization of the Under-sampling Pattern) framework for optimizing the k-space sampling pattern in MRI was extended in three ways: firstly, fully sampled multi-coil k-space data from the scanner, rather than simulated k-space data from magnitude MR images in LOUPE, was retrospectively under-sampled to optimize the under-sampling pattern of in-vivo k-space data; secondly, binary stochastic k-space sampling, rather than approximate stochastic k-space sampling of LOUPE during training, was applied together with a straight-through (ST) estimator to estimate the gradient of the threshold operation in a neural network; thirdly, a modified unrolled optimization network, rather than the modified U-Net in LOUPE, was used as the reconstruction network in order to reconstruct multi-coil data properly and reduce the dependency on training data.

Character Matters: Video Story Understanding with Character-Aware Relations

no code implementations 9 May 2020 Shijie Geng, Ji Zhang, Zuohui Fu, Peng Gao, Hang Zhang, Gerard de Melo

Without identifying the connection between appearing people and character names, a model is not able to obtain a genuine understanding of the plots.

Question Answering

Improving Semantic Segmentation via Self-Training

no code implementations 30 Apr 2020 Yi Zhu, Zhongyue Zhang, Chongruo Wu, Zhi Zhang, Tong He, Hang Zhang, R. Manmatha, Mu Li, Alexander Smola

In the case of semantic segmentation, this means that large amounts of pixelwise annotations are required to learn accurate models.

Domain Generalization Segmentation +1

Let's be Humorous: Knowledge Enhanced Humor Generation

no code implementations ACL 2020 Hang Zhang, Dayiheng Liu, Jiancheng Lv, Cheng Luo

To our knowledge, this is the first attempt to generate punchlines with a knowledge-enhanced model.

Sentence

FairNN- Conjoint Learning of Fair Representations for Fair Decisions

1 code implementation 5 Apr 2020 Tongxin Hu, Vasileios Iosifidis, Wentong Liao, Hang Zhang, Michael Ying Yang, Eirini Ntoutsi, Bodo Rosenhahn

In this paper, we propose FairNN, a neural network that performs joint feature representation and classification for fairness-aware learning.

Classification Decision Making +3

AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

7 code implementations 13 Mar 2020 Nick Erickson, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, Alexander Smola

We introduce AutoGluon-Tabular, an open-source AutoML framework that requires only a single line of Python to train highly accurate machine learning models on an unprocessed tabular dataset such as a CSV file.

Molecular Property Prediction Neural Architecture Search

Skull-RCNN: A CNN-based network for the skull fracture detection

no code implementations MIDL 2019 Zhuo Kuang, Xianbo Deng, Li Yu, Hang Zhang, Xian Lin, Hui Ma

Guided by the morphological features of the skull, a skeleton-based region proposal method is proposed to make candidate boxes more concentrated in key regions and reduce invalid boxes.

Fracture detection Region Proposal

The Benefits of Diversity: Permutation Recovery in Unlabeled Sensing from Multiple Measurement Vectors

no code implementations 5 Sep 2019 Hang Zhang, Martin Slawski, Ping Li

For the case in which both the signal and permutation are unknown, the problem is reformulated as a bi-convex optimization problem with an auxiliary variable, which can be solved by the Alternating Direction Method of Multipliers (ADMM).

Diversity

A Computer Vision Application for Assessing Facial Acne Severity from Selfie Images

no code implementations 18 Jul 2019 Tingting Zhao, Hang Zhang, Jacob Spoelstra

We worked with Nestle SHIELD (Skin Health, Innovation, Education, and Longevity Development, NSH) to develop a deep learning model that is able to assess acne severity from selfie images as accurately as dermatologists.

Data Augmentation Transfer Learning

2nd Place Solution to the GQA Challenge 2019

no code implementations 16 Jul 2019 Shijie Geng, Ji Zhang, Hang Zhang, Ahmed Elgammal, Dimitris N. Metaxas

We present a simple method that achieves unexpectedly superior performance for Visual Question Answering involving complex reasoning.

Question Answering Visual Question Answering +1

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

3 code implementations 9 Jul 2019 Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating).

Deep Learning

Co-Occurrent Features in Semantic Segmentation

2 code implementations CVPR 2019 Hang Zhang, Han Zhang, Chenguang Wang, Junyuan Xie

To leverage the semantic context in the co-occurrent features, we build an Aggregated Co-occurrent Feature (ACF) Module by aggregating the probability of the co-occurrent feature with the co-occurrent context.

Segmentation Semantic Segmentation

Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources

2 code implementations 26 Apr 2019 Haibin Lin, Hang Zhang, Yifei Ma, Tong He, Zhi Zhang, Sheng Zha, Mu Li

One difficulty we observe is that the noise in the stochastic momentum estimation is accumulated over time and will have delayed effects when the batch size changes.

Image Classification object-detection +3

Blaze: Simplified High Performance Cluster Computing

2 code implementations 4 Feb 2019 Junhao Li, Hang Zhang

We present Blaze, a C++ library that makes it easy to develop high performance parallel programs for such compute intensive tasks.

Vocal Bursts Intensity Prediction

Bag of Tricks for Image Classification with Convolutional Neural Networks

27 code implementations CVPR 2019 Tong He, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, Mu Li

Much of the recent progress made in image classification research can be credited to training procedure refinements, such as changes in data augmentations and optimization methods.

Domain Generalization General Classification +4

Deep Texture Manifold for Ground Terrain Recognition

1 code implementation CVPR 2018 Jia Xue, Hang Zhang, Kristin Dana

The GTOS database (comprised of over 30,000 images of 40 classes of ground terrain in outdoor scenes) enables supervised recognition.

Context Encoding for Semantic Segmentation

12 code implementations CVPR 2018 Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal

In this paper, we explore the impact of global contextual information in semantic segmentation by introducing the Context Encoding Module, which captures the semantic context of scenes and selectively highlights class-dependent featuremaps.

Image Classification Segmentation +2

Photo-realistic Facial Texture Transfer

no code implementations 14 Jun 2017 Parneet Kaur, Hang Zhang, Kristin J. Dana

We address the challenging problem of transferring face texture from a style face image to a content face image in a photorealistic manner without changing the identity of the original content image.

Style Transfer

Multi-style Generative Network for Real-time Transfer

6 code implementations 20 Mar 2017 Hang Zhang, Kristin Dana

Despite the rapid progress in style transfer, existing approaches using feed-forward generative networks for multi-style or arbitrary-style transfer usually compromise image quality and model flexibility.

Style Transfer

Deep TEN: Texture Encoding Network

12 code implementations CVPR 2017 Hang Zhang, Jia Xue, Kristin Dana

The representation is orderless and therefore is particularly useful for material and texture recognition.

Dictionary Learning Material Recognition

Differential Angular Imaging for Material Recognition

no code implementations CVPR 2017 Jia Xue, Hang Zhang, Kristin Dana, Ko Nishino

We realize this by developing a framework for differential angular imaging, where small angular variations in image capture provide an enhanced appearance representation and significant recognition improvement.

Material Recognition

Robust Matrix Regression

no code implementations 15 Nov 2016 Hang Zhang, Fengyuan Zhu, Shixin Li

However, in real-world applications, it is common to see the training data contaminated by noise, which can affect the robustness of these matrix regression methods.

regression

Reflectance Hashing for Material Recognition

no code implementations CVPR 2015 Hang Zhang, Kristin Dana, Ko Nishino

Reflectance offers a unique signature of the material but is challenging to measure and use for recognizing materials due to its high dimensionality.

Dictionary Learning Material Recognition
