Search Results for author: Han Zhang

Found 149 papers, 64 papers with code

Multi-lingual Geoparsing based on Machine Translation

no code implementations • 6 Nov 2015 • Xu Chen, Han Zhang, Judith Gelernter

Our results for geoparsing Chinese and Arabic text using our multi-lingual geoparsing method are comparable to our results for geoparsing English text with our English tools.

Machine Translation Translation

Paper
Add Code

SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition

no code implementations • CVPR 2016 • Han Zhang, Tao Xu, Mohamed Elhoseiny, Xiaolei Huang, Shaoting Zhang, Ahmed Elgammal, Dimitris Metaxas

In this paper, we propose a new CNN architecture that integrates semantic part detection and abstraction (SPDA-CNN) for fine-grained classification.

General Classification Object Recognition +1

Paper
Add Code

StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

21 code implementations • ICCV 2017 • Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas

Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications.

Ranked #3 on Text-to-Image Generation on Oxford 102 Flowers (Inception score metric)

Text-to-Image Generation

1,850

Paper
Code

SegAN: Adversarial Network with Multi-scale $L_1$ Loss for Medical Image Segmentation

2 code implementations • 6 Jun 2017 • Yuan Xue, Tao Xu, Han Zhang, Rodney Long, Xiaolei Huang

Extensive experimental results demonstrate the effectiveness of the proposed SegAN with multi-scale loss: on BRATS 2013 SegAN gives performance comparable to the state-of-the-art for whole tumor and tumor core segmentation while achieves better precision and sensitivity for Gd-enhance tumor core segmentation; on BRATS 2015 SegAN achieves better performance than the state-of-the-art in both dice score and precision.

Ranked #1 on Brain Tumor Segmentation on BRATS-2013 leaderboard

Brain Tumor Segmentation Image Segmentation +2

180

Paper
Code

Link the head to the "beak": Zero Shot Learning from Noisy Text Description at Part Precision

no code implementations • CVPR 2017 • Mohamed Elhoseiny, Yizhe Zhu, Han Zhang, Ahmed Elgammal

We propose a learning framework that is able to connect text terms to its relevant parts and suppress connections to non-visual text terms without any part-text annotations.

Zero-Shot Learning

Paper
Add Code

StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

16 code implementations • 19 Oct 2017 • Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas

In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) aiming at generating high-resolution photo-realistic images.

Ranked #5 on Text-to-Image Generation on Oxford 102 Flowers

Generative Adversarial Network Text-to-Image Generation

1,850

Paper
Code

AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

19 code implementations • CVPR 2018 • Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He

In this paper, we propose an Attentional Generative Adversarial Network (AttnGAN) that allows attention-driven, multi-stage refinement for fine-grained text-to-image generation.

Ranked #1 on Text-to-Image Generation on MS-COCO

Generative Adversarial Network Image-text matching +2

1,319

Paper
Code

Improving GANs Using Optimal Transport

2 code implementations • ICLR 2018 • Tim Salimans, Han Zhang, Alec Radford, Dimitris Metaxas

We present Optimal Transport GAN (OT-GAN), a variant of generative adversarial nets minimizing a new metric measuring the distance between the generator distribution and the data distribution.

Image Generation

Paper
Code

Self-Attention Generative Adversarial Networks

49 code implementations • arXiv 2018 • Han Zhang, Ian Goodfellow, Dimitris Metaxas, Augustus Odena

In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks.

Ranked #20 on Conditional Image Generation on ImageNet 128x128

Conditional Image Generation Generative Adversarial Network

17,571

Paper
Code

Deep Chronnectome Learning via Full Bidirectional Long Short-Term Memory Networks for MCI Diagnosis

no code implementations • 30 Aug 2018 • Weizheng Yan, Han Zhang, Jing Sui, Dinggang Shen

Dynamic functional connectivity (dFC), consisting of time-varying spatiotemporal dynamics, may characterize "chronnectome" diagnostic information for improving MCI classification.

Clustering General Classification +1

Paper
Add Code

A Unified Mammogram Analysis Method via Hybrid Deep Supervision

1 code implementation • 31 Aug 2018 • Rongzhao Zhang, Han Zhang, Albert C. S. Chung

In this work, we present a unified mammogram analysis framework for both whole-mammogram classification and segmentation.

Classification General Classification +3

Paper
Code

A Teacher-Student Framework for Maintainable Dialog Manager

no code implementations • EMNLP 2018 • Weikang Wang, Jiajun Zhang, Han Zhang, Mei-Yuh Hwang, Cheng-qing Zong, Zhifei Li

Specifically, the {``}student{''} is an extended dialog manager based on a new ontology, and the {``}teacher{''} is existing resources used for guiding the learning process of the {``}student{''}.

Reinforcement Learning (RL)

Paper
Add Code

Multi-Antenna Channel Interpolation via Tucker Decomposed Extreme Learning Machine

no code implementations • 26 Dec 2018 • Han Zhang, Bo Ai, Wenjun Xu, Li Xu, Shuguang Cui

Channel interpolation is an essential technique for providing high-accuracy estimation of the channel state information (CSI) for wireless systems design where the frequency-space structural correlations of multi-antenna channel are typically hidden in matrix or tensor forms.

Paper
Add Code

Deep Learning for Signal Demodulation in Physical Layer Wireless Communications: Prototype Platform, Open Dataset, and Analytics

no code implementations • 8 Mar 2019 • Hongmei Wang, Zhenzhen Wu, Shuai Ma, Songtao Lu, Han Zhang, Guoru Ding, Shiyin Li

In this paper, we investigate deep learning (DL)-enabled signal demodulation methods and establish the first open dataset of real modulated signals for wireless communication systems.

Paper
Add Code

Signal Demodulation with Machine Learning Methods for Physical Layer Visible Light Communications: Prototype Platform, Open Dataset and Algorithms

no code implementations • 13 Mar 2019 • Shuai Ma, Jiahui Dai, Songtao Lu, Hang Li, Han Zhang, Chun Du, Shiyin Li

The dataset is available online, which contains eight types of modulated signals.

Image Classification

Paper
Add Code

ERNIE: Enhanced Representation through Knowledge Integration

14 code implementations • 19 Apr 2019 • Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu

We present a novel language representation model enhanced by knowledge called ERNIE (Enhanced Representation through kNowledge IntEgration).

Ranked #3 on Natural Language Inference on XNLI Chinese Dev

Chinese Named Entity Recognition Chinese Sentence Pair Classification +8

11,384

Paper
Code

Brain Network Construction and Classification Toolbox (BrainNetClass)

1 code implementation • 17 Jun 2019 • Zhen Zhou, Xiaobo Chen, Yu Zhang, Lishan Qiao, Renping Yu, Gang Pan, Han Zhang, Dinggang Shen

The goal of this work is to introduce a toolbox namely "Brain Network Construction and Classification" (BrainNetClass) to the field to promote more advanced brain network construction methods.

Classification General Classification

Paper
Code

DynWalks: Global Topology and Recent Changes Awareness Dynamic Network Embedding

2 code implementations • arXiv 2019 • Chengbin Hou, Han Zhang, Ke Tang, Shan He

Dynamic network embedding aims to learn low dimensional embeddings for unseen and seen nodes by using any currently available snapshots of a dynamic network.

Graph Reconstruction Link Prediction +1

Paper
Code

Approximation Capabilities of Neural ODEs and Invertible Residual Networks

no code implementations • ICML 2020 • Han Zhang, Xi Gao, Jacob Unterman, Tom Arodz

Neural ODEs and i-ResNet are recently proposed methods for enforcing invertibility of residual neural models.

Paper
Add Code

Multiple instance dense connected convolution neural network for aerial image scene classification

no code implementations • 22 Aug 2019 • Qi Bi, Kun Qin, Zhili Li, Han Zhang, Kai Xu

While the current convolution neural network tends to extract global features and global semantic information in a scene, the geo-spatial objects can be located at anywhere in an aerial image scene and their spatial arrangement tends to be more complicated.

General Classification Scene Classification

Paper
Add Code

Building change detection based on multi-scale filtering and grid partition

no code implementations • 22 Aug 2019 • Qi Bi, Kun Qin, Han Zhang, Wenjun Han, Zhili Li, Kai Xu

Exhaustive experiments indicate that the proposed method can detect building change types directly and outperform the current multi-index learning method.

Change Detection

Paper
Add Code

Distributed Equivalent Substitution Training for Large-Scale Recommender Systems

no code implementations • 10 Sep 2019 • Haidong Rong, Yangzihao Wang, Feihu Zhou, Junjie Zhai, Haiyang Wu, Rui Lan, Fan Li, Han Zhang, Yuekui Yang, Zhenyu Guo, Di Wang

We present Distributed Equivalent Substitution (DES) training, a novel distributed training framework for large-scale recommender systems with dynamic sparse features.

Recommendation Systems

Paper
Add Code

Distilling Effective Supervision from Severe Label Noise

2 code implementations • CVPR 2020 • Zizhao Zhang, Han Zhang, Sercan O. Arik, Honglak Lee, Tomas Pfister

For instance, on CIFAR100 with a $40\%$ uniform noise ratio and only 10 trusted labeled data per class, our method achieves $80. 2{\pm}0. 3\%$ classification accuracy, where the error rate is only $1. 4\%$ higher than a neural network trained without label noise.

Image Classification

32,758

Paper
Code

Differentiable Combinatorial Losses through Generalized Gradients of Linear Programs

no code implementations • 18 Oct 2019 • Xi Gao, Han Zhang, Aliakbar Panahi, Tom Arodz

When samples have internal structure, we often see a mismatch between the objective optimized during training and the model's goal during inference.

Combinatorial Optimization Graph Matching +1

Paper
Add Code

Consistency Regularization for Generative Adversarial Networks

no code implementations • ICLR 2020 • Han Zhang, Zizhao Zhang, Augustus Odena, Honglak Lee

Generative Adversarial Networks (GANs) are known to be difficult to train, despite considerable research effort.

Ranked #5 on Image Generation on CelebA-HQ 128x128

Conditional Image Generation Unconditional Image Generation

Paper
Add Code

Small-GAN: Speeding Up GAN Training Using Core-sets

no code implementations • ICML 2020 • Samarth Sinha, Han Zhang, Anirudh Goyal, Yoshua Bengio, Hugo Larochelle, Augustus Odena

Recent work by Brock et al. (2018) suggests that Generative Adversarial Networks (GANs) benefit disproportionately from large mini-batch sizes.

Active Learning Anomaly Detection +1

Paper
Add Code

Multimodal, Multilingual Grapheme-to-Phoneme Conversion for Low-Resource Languages

no code implementations • WS 2019 • James Route, Steven Hillis, Isak Czeresnia Etinger, Han Zhang, Alan W. black

Grapheme-to-phoneme conversion (g2p) is the task of predicting the pronunciation of words from their orthographic representation.

Paper
Add Code

ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring

3 code implementations • 21 Nov 2019 • David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, Colin Raffel

Distribution alignment encourages the marginal distribution of predictions on unlabeled data to be close to the marginal distribution of ground-truth labels.

Ranked #1 on Semi-Supervised Image Classification on cifar10, 250 Labels

Semi-Supervised Image Classification

1,127

Paper
Code

Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models

3 code implementations • CVPR 2020 • Giannis Daras, Augustus Odena, Han Zhang, Alexandros G. Dimakis

We introduce a new local sparse attention layer that preserves two-dimensional geometry and locality.

Ranked #19 on Conditional Image Generation on ImageNet 128x128

Conditional Image Generation Deep Attention

134

Paper
Code

MANELA: A Multi-Agent Algorithm for Learning Network Embeddings

no code implementations • 1 Dec 2019 • Han Zhang, Hong Xu

On the other hand, learning network embeddings on distributively stored networks still remained understudied: To the best of our knowledge, all existing algorithms for learning network embeddings have hitherto been exclusively centralized and thus cannot be applied to these networks.

BIG-bench Machine Learning Network Embedding

Paper
Add Code

FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence

27 code implementations • NeurIPS 2020 • Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, Colin Raffel

Semi-supervised learning (SSL) provides an effective means of leveraging unlabeled data to improve a model's performance.

Ranked #3 on Semi-Supervised Image Classification on SVHN, 1000 labels

Pseudo Label Semi-Supervised Image Classification

1,056

Paper
Code

ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

4 code implementations • 26 Jan 2020 • Dongling Xiao, Han Zhang, Yukun Li, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks.

Ranked #1 on Question Generation on SQuAD1.1 (using extra training data)

Abstractive Text Summarization Dialogue Generation +3

11,384

Paper
Code

Improved Consistency Regularization for GANs

no code implementations • 11 Feb 2020 • Zhengli Zhao, Sameer Singh, Honglak Lee, Zizhao Zhang, Augustus Odena, Han Zhang

Recent work has increased the performance of Generative Adversarial Networks (GANs) by enforcing a consistency cost on the discriminator.

Image Generation

Paper
Add Code

Solving Missing-Annotation Object Detection with Background Recalibration Loss

2 code implementations • 12 Feb 2020 • Han Zhang, Fangyi Chen, Zhiqiang Shen, Qiqi Hao, Chenchen Zhu, Marios Savvides

In this paper, we introduce a superior solution called Background Recalibration Loss (BRL) that can automatically re-calibrate the loss signals according to the pre-defined IoU threshold and input image.

Object object-detection +1

Paper
Code

A multiple-instance densely-connected ConvNet for aerial scene classification

1 code implementation • IEEE Transactions on Image Processing 2020 • Qi Bi, Kun Qin, Zhili Li, Han Zhang, Kai Xu, Gui-Song Xia

It regards aerial scene classification as a multiple-instance learning problem so that local semantics can be further investigated.

Ranked #3 on Scene Recognition on AID

Aerial Scene Classification Classification +4

Paper
Code

ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring

1 code implementation • ICLR 2020 • David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, Colin Raffel

We improve the recently-proposed ``MixMatch semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring.

129

Paper
Code

A Simple Semi-Supervised Learning Framework for Object Detection

7 code implementations • 10 May 2020 • Kihyuk Sohn, Zizhao Zhang, Chun-Liang Li, Han Zhang, Chen-Yu Lee, Tomas Pfister

Semi-supervised learning (SSL) has a potential to improve the predictive performance of machine learning models using unlabeled data.

Ranked #13 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)

Data Augmentation Image Classification +4

398

Paper
Code

Towards Personalized and Semantic Retrieval: An End-to-End Solution for E-commerce Search via Embedding Learning

no code implementations • 3 Jun 2020 • Han Zhang, Songlin Wang, Kang Zhang, Zhiling Tang, Yunjiang Jiang, Yun Xiao, Weipeng Yan, Wen-Yun Yang

Two critical challenges stay in today's e-commerce search: how to retrieve items that are semantically relevant but not exact matching to query terms, and how to retrieve items that are more personalized to different users for the same search query.

Retrieval Semantic Retrieval

Paper
Add Code

Image Augmentations for GAN Training

no code implementations • 4 Jun 2020 • Zhengli Zhao, Zizhao Zhang, Ting Chen, Sameer Singh, Han Zhang

We provide new state-of-the-art results for conditional generation on CIFAR-10 with both consistency loss and contrastive loss as additional regularizations.

Image Augmentation Image Generation

Paper
Add Code

A Hybrid Evolutionary Algorithm for Reliable Facility Location Problem

no code implementations • 27 Jun 2020 • Han Zhang, Jialin Liu, Xin Yao

The reliable facility location problem (RFLP) is an important research topic of operational research and plays a vital role in the decision-making and management of modern supply chain and logistics.

Decision Making Management

Paper
Add Code

From Spectrum Wavelet to Vertex Propagation: Graph Convolutional Networks Based on Taylor Approximation

no code implementations • 1 Jul 2020 • Songyang Zhang, Han Zhang, Shuguang Cui, Zhi Ding

Graph convolutional networks (GCN) have been recently utilized to extract the underlying structures of datasets with some labeled data and high-dimensional features.

Node Classification

Paper
Add Code

Improving NER's Performance with Massive financial corpus

1 code implementation • 31 Jul 2020 • Han Zhang

Training large deep neural networks needs massive high quality annotation data, but the time and labor costs are too expensive for small business.

Language Modelling

Paper
Code

GloDyNE: Global Topology Preserving Dynamic Network Embedding

2 code implementations • 5 Aug 2020 • Chengbin Hou, Han Zhang, Shan He, Ke Tang

The main and common objective of Dynamic Network Embedding (DNE) is to efficiently update node embeddings while preserving network topology at each time step.

Graph Reconstruction Incremental Learning +1

Paper
Code

Co-evolution of Functional Brain Network at Multiple Scales during Early Infancy

no code implementations • 15 Sep 2020 • Xuyun Wen, Liming Hsu, Weili Lin, Han Zhang, Dinggang Shen

By applying our proposed methodological framework on the collected longitudinal infant dataset, we provided the first evidence that, in the first 2 years of life, the brain functional network is co-evolved at different scales, where each scale displays the unique reconfiguration pattern in terms of modular organization.

Paper
Add Code

PseudoSeg: Designing Pseudo Labels for Semantic Segmentation

2 code implementations • ICLR 2021 • Yuliang Zou, Zizhao Zhang, Han Zhang, Chun-Liang Li, Xiao Bian, Jia-Bin Huang, Tomas Pfister

We demonstrate the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.

Ranked #5 on Semi-Supervised Semantic Segmentation on COCO 1/32 labeled

Data Augmentation Image Classification +2

162

Paper
Code

ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding

2 code implementations • NAACL 2021 • Dongling Xiao, Yu-Kun Li, Han Zhang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

We argue that such contiguously masking method neglects to model the intra-dependencies and inter-relation of coarse-grained linguistic information.

Language Modelling Masked Language Modeling +3

11,384

Paper
Code

Transfer learning of chaotic systems

no code implementations • 15 Nov 2020 • Yali Guo, Han Zhang, Liang Wang, Huawei Fan, Xingang Wang

Here we investigate transfer learning of chaotic systems from the perspective of synchronization-based state inference, in which a reservoir computer trained by chaotic system A is used to infer the unmeasured variables of chaotic system B, while A is different from B in either parameter or dynamics.

General Knowledge Time Series +2

Paper
Add Code

Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework

1 code implementation • 15 Dec 2020 • Han Zhang, Wenhao Zheng, Charley Chen, Kevin Gao, Yao Hu, Ling Huang, Wei Xu

Meanwhile, such applications usually require modeling the intrinsic clusters in high-dimensional data, which usually displays heterogeneous statistical patterns as the patterns of different clusters may appear in different dimensions.

Anomaly Detection Fraud Detection

Paper
Code

Cross-Modal Contrastive Learning for Text-to-Image Generation

1 code implementation • CVPR 2021 • Han Zhang, Jing Yu Koh, Jason Baldridge, Honglak Lee, Yinfei Yang

The quality of XMC-GAN's output is a major step up from previous models, as we show on three challenging datasets.

Ranked #27 on Text-to-Image Generation on MS COCO (using extra training data)

Contrastive Learning Generative Adversarial Network +1

100

Paper
Code

A Multiscale Graph Convolutional Network for Change Detection in Homogeneous and Heterogeneous Remote Sensing Images

no code implementations • 16 Feb 2021 • Junzheng Wu, Biao Li, Yao Qin, Weiping Ni, Han Zhang, Yuli Sun

In this paper, a novel CD method based on the graph convolutional network (GCN) and multiscale object-based technique is proposed for both homogeneous and heterogeneous images.

Change Detection

Paper
Add Code

Query Rewriting via Cycle-Consistent Translation for E-Commerce Search

no code implementations • 1 Mar 2021 • Yiming Qiu, Kang Zhang, Han Zhang, Songlin Wang, Sulong Xu, Yun Xiao, Bo Long, Wen-Yun Yang

Online A/B experiments show that it improves core e-commerce business metrics significantly.

Machine Translation Translation

Paper
Add Code

Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction

1 code implementation • ICLR 2021 • Wonkwang Lee, Whie Jung, Han Zhang, Ting Chen, Jing Yu Koh, Thomas Huang, Hyungsuk Yoon, Honglak Lee, Seunghoon Hong

Despite the recent advances in the literature, existing approaches are limited to moderately short-term prediction (less than a few seconds), while extrapolating it to a longer future quickly leads to destruction in structure and content.

Translation Video Prediction

Paper
Code

Transfer training from smaller language model

no code implementations • 23 Apr 2021 • Han Zhang

We initialize a larger target model from a smaller source model by copy weight values from source model and padding with zeros or small initialization values on it to make the source and target model have approximate outputs, which is valid due to block matrix multiplication and residual connection in transformer structure.

Language Modelling Large Language Model +1

Paper
Add Code

Learning Hamiltonian dynamics by reservoir computer

no code implementations • 24 Apr 2021 • Han Zhang, Huawei Fan, Liang Wang, Xingang Wang

Reconstructing the KAM dynamics diagram of Hamiltonian system from the time series of a limited number of parameters is an outstanding question in nonlinear science, especially when the Hamiltonian governing the system dynamics are unknown.

Time Series Time Series Analysis

Paper
Add Code

PanGu-$α$: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation

4 code implementations • 26 Apr 2021 • Wei Zeng, Xiaozhe Ren, Teng Su, Hui Wang, Yi Liao, Zhiwei Wang, Xin Jiang, ZhenZhang Yang, Kaisheng Wang, Xiaoda Zhang, Chen Li, Ziyan Gong, Yifan Yao, Xinjing Huang, Jun Wang, Jianfeng Yu, Qi Guo, Yue Yu, Yan Zhang, Jin Wang, Hengtao Tao, Dasen Yan, Zexuan Yi, Fang Peng, Fangqing Jiang, Han Zhang, Lingfeng Deng, Yehong Zhang, Zhe Lin, Chao Zhang, Shaojie Zhang, Mingyue Guo, Shanzhi Gu, Gaojun Fan, YaoWei Wang, Xuefeng Jin, Qun Liu, Yonghong Tian

To enhance the generalization ability of PanGu-$\alpha$, we collect 1. 1TB high-quality Chinese data from a wide range of domains to pretrain the model.

Ranked #1 on Reading Comprehension (One-Shot) on DuReader

Cloze (multi-choices) (Few-Shot) Cloze (multi-choices) (One-Shot) +19

219

Paper
Code

Joint Learning of Deep Retrieval Model and Product Quantization based Embedding Index

1 code implementation • 9 May 2021 • Han Zhang, Hongwei Shen, Yiming Qiu, Yunjiang Jiang, Songlin Wang, Sulong Xu, Yun Xiao, Bo Long, Wen-Yun Yang

Embedding index that enables fast approximate nearest neighbor(ANN) search, serves as an indispensable component for state-of-the-art deep retrieval systems.

Quantization Retrieval

Paper
Code

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

6 code implementations • 26 May 2021 • Zizhao Zhang, Han Zhang, Long Zhao, Ting Chen, Sercan O. Arik, Tomas Pfister

Hierarchical structures are popular in recent vision transformers, however, they require sophisticated designs and massive datasets to work well.

Ranked #84 on Image Classification on CIFAR-10

Image Classification Image Generation

29,680

Paper
Code

Improved Transformer for High-Resolution GANs

1 code implementation • NeurIPS 2021 • Long Zhao, Zizhao Zhang, Ting Chen, Dimitris N. Metaxas, Han Zhang

Attention-based models, exemplified by the Transformer, can effectively model long range dependency, but suffer from the quadratic complexity of self-attention operation, making them difficult to be adopted for high-resolution image generation based on Generative Adversarial Networks (GANs).

Ranked #2 on Image Generation on CelebA 256x256 (FID metric)

Image Generation Vocal Bursts Intensity Prediction

Paper
Code

SearchGCN: Powering Embedding Retrieval by Graph Convolution Networks for E-Commerce Search

no code implementations • 1 Jul 2021 • Xinlin Xia, Shang Wang, Han Zhang, Songlin Wang, Sulong Xu, Yun Xiao, Bo Long, Wen-Yun Yang

Graph convolution networks (GCN), which recently becomes new state-of-the-art method for graph node classification, recommendation and other applications, has not been successfully applied to industrial-scale search engine yet.

Node Classification Retrieval

Paper
Add Code

Local semantic enhanced convnet for aerial scene recognition

1 code implementation • IEEE Transactions on Image Processing 2021 • Qi Bi, Kun Qin, Han Zhang, Gui-Song Xia

Our LSE-Net consists of a context enhanced convolutional feature extractor, a local semantic perception module and a classification layer.

Ranked #2 on Scene Recognition on AID

Aerial Scene Classification Image Classification +2

Paper
Code

ViTGAN: Training GANs with Vision Transformers

3 code implementations • ICLR 2022 • Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Zhuowen Tu, Ce Liu

Recently, Vision Transformers (ViTs) have shown competitive performance on image recognition while requiring less vision-specific inductive biases.

Ranked #68 on Image Generation on CIFAR-10

Image Generation

506

Paper
Code

Deep Image Synthesis from Intuitive User Input: A Review and Perspectives

no code implementations • 9 Jul 2021 • Yuan Xue, Yuan-Chen Guo, Han Zhang, Tao Xu, Song-Hai Zhang, Xiaolei Huang

In many applications of computer graphics, art and design, it is desirable for a user to provide intuitive non-image input, such as text, sketch, stroke, graph or layout, and have a computer system automatically generate photo-realistic images that adhere to the input content.

Image Generation Image Retrieval +1

Paper
Add Code

DeepAID: Interpreting and Improving Deep Learning-based Anomaly Detection in Security Applications

1 code implementation • 23 Sep 2021 • Dongqi Han, Zhiliang Wang, Wenqi Chen, Ying Zhong, Su Wang, Han Zhang, Jiahai Yang, Xingang Shi, Xia Yin

Experimental results show that DeepAID can provide high-quality interpretations for unsupervised DL models while meeting the special requirements of security domains.

Anomaly Detection

Paper
Code

Vector-quantized Image Modeling with Improved VQGAN

5 code implementations • ICLR 2022 • Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu

Motivated by this success, we explore a Vector-quantized Image Modeling (VIM) approach that involves pretraining a Transformer to predict rasterized image tokens autoregressively.

Image Generation Representation Learning +1

10,812

Paper
Code

You Ought to Look Around: Precise, Large Span Action Detection

no code implementations • 25th International Conference on Pattern Recognition (ICPR) 2021 • Ge Pan, Han Zhang, Fan Yu, Yonghong Song, Yuanlin Zhang, Han Yuan

In this paper, we propose a method called YOLA (You Ought to Look Around) which includes three parts: 1) a robust backbone SPN-I3D for extracting spatio-temporal features.

Action Detection Action Localization

Paper
Add Code

threaTrace: Detecting and Tracing Host-based Threats in Node Level Through Provenance Graph Learning

1 code implementation • 8 Nov 2021 • Su Wang, Zhiliang Wang, Tao Zhou, Xia Yin, Dongqi Han, Han Zhang, Hongbin Sun, Xingang Shi, Jiahai Yang

Recent studies propose leveraging the rich contextual information in data provenance to detect threats in a host.

Graph Learning Intrusion Detection

Paper
Code

BLT: Bidirectional Layout Transformer for Controllable Layout Generation

1 code implementation • 9 Dec 2021 • Xiang Kong, Lu Jiang, Huiwen Chang, Han Zhang, Yuan Hao, Haifeng Gong, Irfan Essa

During inference, BLT first generates a draft layout from the input and then iteratively refines it into a high-quality layout by masking out low-confident attributes.

Paper
Code

Temporal Transformer Networks with Self-Supervision for Action Recognition

no code implementations • 14 Dec 2021 • Yongkang Zhang, Jun Li, Guoming Wu, Han Zhang, Zhiping Shi, Zhaoxun Liu, Zizhang Wu, Na Jiang

The temporal sequence self-supervision module we employ unprecedentedly adopts the streamlined strategy of "random batch random channel" to reverse the sequence of video frames, allowing robust extractions of motion information representation from inversed temporal dimensions and improving the generalization capability of the model.

Action Recognition Temporal Action Localization

Paper
Add Code

Learning to Prompt for Continual Learning

4 code implementations • CVPR 2022 • Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister

The mainstream paradigm behind continual learning has been to adapt the model parameters to non-stationary data distributions, where catastrophic forgetting is the central challenge.

Class Incremental Learning Image Classification

1,659

Paper
Code

ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation

2 code implementations • 31 Dec 2021 • Han Zhang, Weichong Yin, Yewei Fang, Lanxin Li, Boqiang Duan, Zhihua Wu, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

To explore the landscape of large-scale pre-training for bidirectional text-image generation, we train a 10-billion parameter ERNIE-ViLG model on a large-scale dataset of 145 million (Chinese) image-text pairs which achieves state-of-the-art performance for both text-to-image and image-to-text tasks, obtaining an FID of 7. 9 on MS-COCO for text-to-image synthesis and best results on COCO-CN and AIC-ICC for image captioning.

Ranked #42 on Text-to-Image Generation on MS COCO

Image Captioning Quantization +2

11,380

Paper
Code

MAXIM: Multi-Axis MLP for Image Processing

1 code implementation • CVPR 2022 • Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

In this work, we present a multi-axis MLP based architecture called MAXIM, that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks.

Ranked #1 on Deblurring on HIDE (trained on GOPRO)

Deblurring Image Deblurring +6

937

Paper
Code

MaskGIT: Masked Generative Image Transformer

6 code implementations • CVPR 2022 • Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman

At inference time, the model begins with generating all tokens of an image simultaneously, and then refines the image iteratively conditioned on the previous generation.

Ranked #2 on Text-to-Image Generation on LHQC

Image Manipulation Image Outpainting +1

1,110

Paper
Code

StyleBERT: Chinese pretraining by font style information

no code implementations • 21 Feb 2022 • Chao Lv, Han Zhang, Xinkai Du, Yunhao Zhang, Ying Huang, Wenhao Li, Jia Han, Shanshan Gu

With the success of down streaming task using English pre-trained language model, the pre-trained Chinese language model is also necessary to get a better performance of Chinese NLP task.

Language Modelling

Paper
Add Code

Topology-Preserving Segmentation Network: A Deep Learning Segmentation Framework for Connected Component

no code implementations • 27 Feb 2022 • Han Zhang, Lok Ming Lui

TPSN is a deformation-based model that yields a deformation map through a UNet, which takes the medical image and a template mask as inputs.

Image Segmentation Medical Image Segmentation +2

Paper
Add Code

Givens Coordinate Descent Methods for Rotation Matrix Learning in Trainable Embedding Indexes

no code implementations • ICLR 2022 • Yunjiang Jiang, Han Zhang, Yiming Qiu, Yun Xiao, Bo Long, Wen-Yun Yang

Product quantization (PQ) coupled with a space rotation, is widely used in modern approximate nearest neighbor (ANN) search systems to significantly compress the disk storage for embeddings and speed up the inner product computation.

Quantization

Paper
Add Code

PNM: Pixel Null Model for General Image Segmentation

no code implementations • 13 Mar 2022 • Han Zhang, Zihao Zhang, Wenhao Zheng, Wei Xu

A major challenge in image segmentation is classifying object boundaries.

Image Segmentation Object +2

Paper
Add Code

Learning Instance-Specific Adaptation for Cross-Domain Segmentation

no code implementations • 30 Mar 2022 • Yuliang Zou, Zizhao Zhang, Chun-Liang Li, Han Zhang, Tomas Pfister, Jia-Bin Huang

We propose a test-time adaptation method for cross-domain image segmentation.

Data Augmentation Domain Generalization +5

Paper
Add Code

Unitail: Detecting, Reading, and Matching in Retail Scene

no code implementations • 1 Apr 2022 • Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides

To make full use of computer vision technology in stores, it is required to consider the actual needs that fit the characteristics of the retail scene.

Ranked #1 on Dense Object Detection on SKU-110K

Benchmarking Dense Object Detection +2

Paper
Add Code

MaxViT: Multi-Axis Vision Transformer

14 code implementations • 4 Apr 2022 • Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

We also show that our proposed model expresses strong generative modeling capability on ImageNet, demonstrating the superior potential of MaxViT blocks as a universal vision module.

Ranked #1 on Object Detection on COCO 2017

Image Classification object-detection +1

29,676

Paper
Code

Powering Finetuning in Few-Shot Learning: Domain-Agnostic Bias Reduction with Selected Sampling

no code implementations • 7 Apr 2022 • Ran Tao, Han Zhang, Yutong Zheng, Marios Savvides

Class-agnostic bias is defined as the distribution shifting introduced by domain difference, which we propose Distribution Calibration Module(DCM) to reduce.

Few-Shot Learning

Paper
Add Code

DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning

2 code implementations • 10 Apr 2022 • Zifeng Wang, Zizhao Zhang, Sayna Ebrahimi, Ruoxi Sun, Han Zhang, Chen-Yu Lee, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister

Continual learning aims to enable a single model to learn a sequence of tasks without catastrophic forgetting.

Continual Learning

371

Paper
Code

Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder

1 code implementation • 6 May 2022 • Zhenghao Liu, Han Zhang, Chenyan Xiong, Zhiyuan Liu, Yu Gu, Xiaohua LI

These embeddings need to be high-dimensional to fit training signals and guarantee the retrieval effectiveness of dense retrievers.

Ranked #1 on Information Retrieval on MS MARCO

Dimensionality Reduction Information Retrieval +1

Paper
Code

Faithful Explanations for Deep Graph Models

no code implementations • 24 May 2022 • Zifan Wang, Yuhang Yao, Chaoran Zhang, Han Zhang, Youjie Kang, Carlee Joe-Wong, Matt Fredrikson, Anupam Datta

Second, our analytical and empirical results demonstrate that feature attribution methods cannot capture the nonlinear effect of edge features, while existing subgraph explanation methods are not faithful.

Anomaly Detection

Paper
Add Code

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

2 code implementations • 22 Jun 2022 • Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, ZiRui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu

We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.

Ranked #1 on Text-to-Image Generation on LAION COCO

Machine Translation Text-to-Image Generation +1

506

Paper
Code

Pre-training Tasks for User Intent Detection and Embedding Retrieval in E-commerce Search

1 code implementation • 12 Aug 2022 • Yiming Qiu, Chenyu Zhao, Han Zhang, Jingwei Zhuo, TianHao Li, Xiaowei Zhang, Songlin Wang, Sulong Xu, Bo Long, Wen-Yun Yang

BERT-style models pre-trained on the general corpus (e. g., Wikipedia) and fine-tuned on specific task corpus, have recently emerged as breakthrough techniques in many NLP tasks: question answering, text classification, sequence labeling and so on.

Intent Detection Question Answering +3

Paper
Code

Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets

no code implementations • 15 Aug 2022 • Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Xiang Li, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides

Beyond classification, Conv-Adapter can generalize to detection and segmentation tasks with more than 50% reduction of parameters but comparable performance to the traditional full fine-tuning.

Transfer Learning

Paper
Add Code

Hierarchical Graph Convolutional Network Built by Multiscale Atlases for Brain Disorder Diagnosis Using Functional Connectivity

no code implementations • 22 Sep 2022 • Mianxin Liu, Han Zhang, Feng Shi, Dinggang Shen

In this study, we propose a novel framework to perform multiscale FCN analysis for brain disorder diagnosis.

Paper
Add Code

Visual Prompt Tuning for Generative Transfer Learning

1 code implementation • CVPR 2023 • Kihyuk Sohn, Yuan Hao, José Lezama, Luisa Polania, Huiwen Chang, Han Zhang, Irfan Essa, Lu Jiang

We base our framework on state-of-the-art generative vision transformers that represent an image as a sequence of visual tokens to the autoregressive or non-autoregressive transformers.

Image Generation Transfer Learning +1

Paper
Code

Phenaki: Variable Length Video Generation From Open Domain Textual Description

2 code implementations • 5 Oct 2022 • Ruben Villegas, Mohammad Babaeizadeh, Pieter-Jan Kindermans, Hernan Moraldo, Han Zhang, Mohammad Taghi Saffar, Santiago Castro, Julius Kunze, Dumitru Erhan

To the best of our knowledge, this is the first time a paper studies generating videos from time variable prompts.

Ranked #4 on Video Prediction on BAIR Robot Pushing

Video Generation Video Prediction

711

Paper
Code

Towards Semi-automatic Detection and Localization of Indoor Accessibility Issues using Mobile Depth Scanning and Computer Vision

no code implementations • 5 Oct 2022 • Xia Su, Kaiming Cheng, Han Zhang, Jaewook Lee, Jon E. Froehlich

To help improve the safety and accessibility of indoor spaces, researchers and health professionals have created assessment instruments that enable homeowners and trained experts to audit and improve homes.

Paper
Add Code

Topology-Preserving Segmentation Network

no code implementations • 7 Oct 2022 • Han Zhang, Lok Ming Lui

Comparing to the segmentation framework based on pixel-wise classification, deformation-based segmentation models that warp a template to enclose the regions are more convenient to enforce geometric constraints.

Image Segmentation Medical Image Segmentation +2

Paper
Add Code

Using Language to Extend to Unseen Domains

1 code implementation • 18 Oct 2022 • Lisa Dunlap, Clara Mohri, Devin Guillory, Han Zhang, Trevor Darrell, Joseph E. Gonzalez, aditi raghunathan, Anja Rohrbach

It is expensive to collect training data for every possible domain that a vision model may encounter when deployed.

Domain Adaptation

Paper
Code

GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization

1 code implementation • 4 Nov 2022 • Xuhai Xu, Han Zhang, Yasaman Sefidgar, Yiyi Ren, Xin Liu, Woosuk Seo, Jennifer Brown, Kevin Kuehn, Mike Merrill, Paula Nurius, Shwetak Patel, Tim Althoff, Margaret E. Morris, Eve Riskin, Jennifer Mankoff, Anind K. Dey

We envision our multi-year datasets can support the ML community in developing generalizable longitudinal behavior modeling algorithms.

Depression Detection Domain Generalization

272

Paper
Code

MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

1 code implementation • CVPR 2023 • Tianhong Li, Huiwen Chang, Shlok Kumar Mishra, Han Zhang, Dina Katabi, Dilip Krishnan

In this work, we propose MAsked Generative Encoder (MAGE), the first framework to unify SOTA image generation and self-supervised representation learning.

Ranked #2 on Unconditional Image Generation on ImageNet 256x256

Image Generation Representation Learning +1

448

Paper
Code

Cost Splitting for Multi-Objective Conflict-Based Search

no code implementations • 23 Nov 2022 • Cheng Ge, Han Zhang, Jiaoyang Li, Sven Koenig

Our theoretical results show that, when combined with either of these two new splitting strategies, MO-CBS maintains its completeness and optimality guarantees.

Multi-Agent Path Finding

Paper
Add Code

Dimensionality-Varying Diffusion Process

no code implementations • CVPR 2023 • Han Zhang, Ruili Feng, Zhantao Yang, Lianghua Huang, Yu Liu, Yifei Zhang, Yujun Shen, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which learn to reverse a signal destruction process to generate new data, typically require the signal at each step to have the same dimension.

Image Generation

Paper
Add Code

FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation

1 code implementation • ICCV 2023 • Ronghui Li, Junfan Zhao, Yachao Zhang, Mingyang Su, Zeping Ren, Han Zhang, Yansong Tang, Xiu Li

To address these problems, we propose FineDance, which contains 14. 6 hours of music-dance paired data, with fine-grained hand motions, fine-grained genres (22 dance genres), and accurate posture.

Motion Synthesis Retrieval

Paper
Code

MAGVIT: Masked Generative Video Transformer

1 code implementation • CVPR 2023 • Lijun Yu, Yong Cheng, Kihyuk Sohn, José Lezama, Han Zhang, Huiwen Chang, Alexander G. Hauptmann, Ming-Hsuan Yang, Yuan Hao, Irfan Essa, Lu Jiang

We introduce the MAsked Generative VIdeo Transformer, MAGVIT, to tackle various video synthesis tasks with a single model.

Ranked #1 on Video Prediction on Something-Something V2

Multi-Task Learning Text-to-Video Generation +2

842

Paper
Code

Enhanced Training of Query-Based Object Detection via Selective Query Recollection

2 code implementations • CVPR 2023 • Fangyi Chen, Han Zhang, Kai Hu, Yu-Kai Huang, Chenchen Zhu, Marios Savvides

This paper investigates a phenomenon where query-based object detectors mispredict at the last decoding stage while predicting correctly at an intermediate stage.

Ranked #10 on Object Detection on COCO 2017 val

Attribute Object +2

1,813

Paper
Code

Muse: Text-To-Image Generation via Masked Generative Transformers

4 code implementations • 2 Jan 2023 • Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan

Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.

Ranked #1 on Text-to-Image Generation on MS-COCO (FID metric)

Language Modelling Large Language Model +1

813

Paper
Code

Evolution of Filter Bubbles and Polarization in News Recommendation

no code implementations • 26 Jan 2023 • Han Zhang, Ziwei Zhu, James Caverlee

However, most existing work focuses on a static setting or over a short-time window, leaving open questions about the long-term and dynamic impacts of news recommendations.

News Recommendation Recommendation Systems

Paper
Add Code

VQ3D: Learning a 3D-Aware Generative Model on ImageNet

no code implementations • ICCV 2023 • Kyle Sargent, Jing Yu Koh, Han Zhang, Huiwen Chang, Charles Herrmann, Pratul Srinivasan, Jiajun Wu, Deqing Sun

Recent work has shown the possibility of training generative models of 3D content from 2D image collections on small datasets corresponding to a single object class, such as human faces, animal faces, or cars.

Position

Paper
Add Code

Deep learning reveals the common spectrum underlying multiple brain disorders in youth and elders from brain functional networks

no code implementations • 23 Feb 2023 • Mianxin Liu, Jingyang Zhang, Yao Wang, Yan Zhou, Fang Xie, Qihao Guo, Feng Shi, Han Zhang, Qian Wang, Dinggang Shen

Brain disorders in the early and late life of humans potentially share pathological alterations in brain functions.

Paper
Add Code

StraIT: Non-autoregressive Generation with Stratified Image Transformer

no code implementations • 1 Mar 2023 • Shengju Qian, Huiwen Chang, Yuanzhen Li, Zizhao Zhang, Jiaya Jia, Han Zhang

We propose Stratified Image Transformer(StraIT), a pure non-autoregressive(NAR) generative model that demonstrates superiority in high-quality image synthesis over existing autoregressive(AR) and diffusion models(DMs).

Image Generation

Paper
Add Code

Streaming Kernel PCA Algorithm With Small Space

no code implementations • 8 Mar 2023 • Yichuan Deng, Zhao Song, Zifan Wang, Han Zhang

The kernel method, which is commonly used in learning algorithms such as Support Vector Machines (SVMs), has also been applied in PCA algorithms.

Paper
Add Code

Steering Prototypes with Prompt-tuning for Rehearsal-free Continual Learning

2 code implementations • 16 Mar 2023 • Zhuowei Li, Long Zhao, Zizhao Zhang, Han Zhang, Di Liu, Ting Liu, Dimitris N. Metaxas

In the context of continual learning, prototypes-as representative class embeddings-offer advantages in memory conservation and the mitigation of catastrophic forgetting.

Class Incremental Learning Contrastive Learning +1

Paper
Code

Delayed and Indirect Impacts of Link Recommendations

no code implementations • 17 Mar 2023 • Han Zhang, Shangen Lu, Yixin Wang, Mihaela Curmei

Moreover, we show that the effects of recommendations can persist in networks, in part due to their indirect impacts on natural dynamics even after recommendations are turned off.

counterfactual

Paper
Add Code

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning

1 code implementation • ICCV 2023 • Ligong Han, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, Feng Yang

Diffusion models have achieved remarkable success in text-to-image generation, enabling the creation of high-quality images from text prompts or other modalities.

Data Augmentation Efficient Diffusion Personalization +1

354

Paper
Code

Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models

no code implementations • 5 Apr 2023 • Xuhui Jia, Yang Zhao, Kelvin C. K. Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, Yu-Chuan Su

This paper proposes a method for generating images of customized objects specified by users.

Caption Generation Image Generation +1

Paper
Add Code

Identity Encoder for Personalized Diffusion

no code implementations • 14 Apr 2023 • Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia

Our approach greatly reduces the overhead for personalized image generation and is more applicable in many potential applications.

Image Enhancement Image Generation

Paper
Add Code

Pelphix: Surgical Phase Recognition from X-ray Images in Percutaneous Pelvic Fixation

no code implementations • 18 Apr 2023 • Benjamin D. Killeen, Han Zhang, Jan Mangulabnan, Mehran Armand, Russel H. Taylor, Greg Osgood, Mathias Unberath

Surgical phase recognition (SPR) is a crucial element in the digital transformation of the modern operating theater.

Anatomy Surgical phase recognition

Paper
Add Code

Spikingformer: Spike-driven Residual Learning for Transformer-based Spiking Neural Network

1 code implementation • 24 Apr 2023 • Chenlin Zhou, Liutao Yu, Zhaokun Zhou, Zhengyu Ma, Han Zhang, Huihui Zhou, Yonghong Tian

Based on this residual design, we develop Spikingformer, a pure transformer-based spiking neural network.

Paper
Code

Mining fMRI Dynamics with Parcellation Prior for Brain Disease Diagnosis

no code implementations • 4 May 2023 • Xiaozhao Liu, Mianxin Liu, Lang Mei, Yuyao Zhang, Feng Shi, Han Zhang, Dinggang Shen

To characterize atypical brain dynamics under diseases, prevalent studies investigate functional magnetic resonance imaging (fMRI).

Graph Learning Multiple Instance Learning

Paper
Add Code

Enhancing the Performance of Transformer-based Spiking Neural Networks by SNN-optimized Downsampling with Precise Gradient Backpropagation

1 code implementation • 10 May 2023 • Chenlin Zhou, Han Zhang, Zhaokun Zhou, Liutao Yu, Zhengyu Ma, Huihui Zhou, Xiaopeng Fan, Yonghong Tian

In this paper, we propose ConvBN-MaxPooling-LIF (CML), an SNN-optimized downsampling with precise gradient backpropagation.

Paper
Code

Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation

1 code implementation • NeurIPS 2023 • Lisa Dunlap, Alyssa Umino, Han Zhang, Jiezhi Yang, Joseph E. Gonzalez, Trevor Darrell

As such, we explore how natural language descriptions of the domains seen in training data can be used with large vision models trained on diverse pretraining datasets to generate useful variations of the training data.

Domain Generalization Image Augmentation

Paper
Code

Learning Disentangled Prompts for Compositional Image Synthesis

no code implementations • 1 Jun 2023 • Kihyuk Sohn, Albert Shaw, Yuan Hao, Han Zhang, Luisa Polania, Huiwen Chang, Lu Jiang, Irfan Essa

We study domain-adaptive image synthesis, the problem of teaching pretrained image generative models a new style or concept from as few as one image to synthesize novel images, to better understand the compositional image synthesis.

Domain Adaptation Image Generation +1

Paper
Add Code

Eliminating Lipschitz Singularities in Diffusion Models

no code implementations • 20 Jun 2023 • Zhantao Yang, Ruili Feng, Han Zhang, Yujun Shen, Kai Zhu, Lianghua Huang, Yifei Zhang, Yu Liu, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which employ stochastic differential equations to sample images through integrals, have emerged as a dominant class of generative models.

Paper
Add Code

ArrayBot: Reinforcement Learning for Generalizable Distributed Manipulation through Touch

no code implementations • 29 Jun 2023 • Zhengrong Xue, Han Zhang, Jingwen Cheng, Zhengmao He, Yuanchen Ju, Changyi Lin, Gu Zhang, Huazhe Xu

We present ArrayBot, a distributed manipulation system consisting of a $16 \times 16$ array of vertically sliding pillars integrated with tactile sensors, which can simultaneously support, perceive, and manipulate the tabletop objects.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Errors Dynamics in Affine Group Systems

no code implementations • 31 Jul 2023 • Xinghan Li, Jianqi Chen, Han Zhang, Jieqiang Wei, Junfeng Wu

In this paper, we focus on the error dynamics analysis for an affine group system under external disturbances or random noises.

Paper
Add Code

NEON: Living Needs Prediction System in Meituan

no code implementations • 31 Jul 2023 • Xiaochong Lan, Chen Gao, Shiqi Wen, Xiuqi Chen, Yingge Che, Han Zhang, Huazhou Wei, Hengliang Luo, Yong Li

To address these two challenges, we design a system of living NEeds predictiON named NEON, consisting of three phases: feature mining, feature fusion, and multi-task prediction.

Paper
Add Code

Differentiable Retrieval Augmentation via Generative Language Modeling for E-commerce Query Intent Classification

no code implementations • 18 Aug 2023 • Chenyu Zhao, Yunjiang Jiang, Yiming Qiu, Han Zhang, Wen-Yun Yang

Retrieval augmentation, which enhances downstream models by a knowledge retriever and an external corpus instead of by merely increasing the number of model parameters, has been successfully applied to many natural language processing (NLP) tasks such as text classification, question answering and so on.

intent-classification Intent Classification +5

Paper
Add Code

Price-Discrimination Game for Distributed Resource Management in Federated Learning

no code implementations • 26 Aug 2023 • Han Zhang, Halvin Yang, Guopeng Zhang

In vanilla federated learning (FL) such as FedAvg, the parameter server (PS) and multiple distributed clients can form a typical buyer's market, where the number of PS/buyers of FL services is far less than the number of clients/sellers.

Federated Learning Management

Paper
Add Code

Chat2Brain: A Method for Mapping Open-Ended Semantic Queries to Brain Activation Maps

no code implementations • 10 Sep 2023 • Yaonai Wei, Tuo Zhang, Han Zhang, Tianyang Zhong, Lin Zhao, Zhengliang Liu, Chong Ma, Songyao Zhang, Muheng Shang, Lei Du, Xiao Li, Tianming Liu, Junwei Han

In this study, we propose a method called Chat2Brain that combines LLMs to basic text-2-image model, known as Text2Brain, to map open-ended semantic queries to brain activation maps in data-scarce and complex query environments.

Paper
Add Code

Deformation-Invariant Neural Network and Its Applications in Distorted Image Restoration and Analysis

no code implementations • 4 Oct 2023 • Han Zhang, Qiguang Chen, Lok Ming Lui

The QCTN is a deep neural network that outputs a quasiconformal map, which can be used to transform a geometrically distorted image into an improved version that is closer to the distribution of natural or good images.

Image Classification Image Restoration +1

Paper
Add Code

Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency

no code implementations • 5 Oct 2023 • Tianhong Li, Sangnie Bhardwaj, Yonglong Tian, Han Zhang, Jarred Barber, Dina Katabi, Guillaume Lajoie, Huiwen Chang, Dilip Krishnan

We demonstrate image generation and captioning performance on par with state-of-the-art text-to-image and image-to-text models with orders of magnitude fewer (only 3M) paired image-text data.

Text-to-Image Generation

Paper
Add Code

Accurate battery lifetime prediction across diverse aging conditions with deep learning

no code implementations • 8 Oct 2023 • Han Zhang, Yuqi Li, Shun Zheng, Ziheng Lu, Xiaofan Gui, Wei Xu, Jiang Bian

Here we introduce a universal deep learning approach that is capable of accommodating various aging conditions and facilitating effective learning under low-resource conditions by leveraging data from rich conditions.

Single-cell modeling

Paper
Add Code

Towards Foundation Models for Learning on Tabular Data

no code implementations • 11 Oct 2023 • Han Zhang, Xumeng Wen, Shun Zheng, Wei Xu, Jiang Bian

Despite considerable efforts in developing effective learning models for tabular data, current transferable tabular models remain in their infancy, limited by either the lack of support for direct instruction following in new tasks or the neglect of acquiring foundational knowledge and capabilities from diverse tabular datasets.

Instruction Following Language Modelling +1

Paper
Add Code

BatteryML:An Open-source platform for Machine Learning on Battery Degradation

1 code implementation • 23 Oct 2023 • Han Zhang, Xiaofan Gui, Shun Zheng, Ziheng Lu, Yuqi Li, Jiang Bian

Battery degradation remains a pivotal concern in the energy storage domain, with machine learning emerging as a potent tool to drive forward insights and solutions.

420

Paper
Code

COPR: Continual Learning Human Preference through Optimal Policy Regularization

no code implementations • 24 Oct 2023 • Han Zhang, Lin Gui, Yuanzhao Zhai, Hui Wang, Yu Lei, Ruifeng Xu

The technique of Reinforcement Learning from Human Feedback (RLHF) is a commonly employed method to improve pre-trained Language Models (LM), enhancing their ability to conform to human preferences.

Continual Learning reinforcement-learning

Paper
Add Code

K-BMPC: Derivative-based Koopman Bilinear Model Predictive Control For Tractor-trailer Trajectory Tracking With Unknown Parameters

no code implementations • 15 Nov 2023 • Zehao Wang, Han Zhang, Jingchuan Wang

Koopman Bilinear Model Predictive control (K-BMPC) is proposed to solve the trajectory tracking problem.

Computational Efficiency Model Predictive Control

Paper
Add Code

GFS: Graph-based Feature Synthesis for Prediction over Relational Databases

no code implementations • 4 Dec 2023 • Han Zhang, Quan Gan, David Wipf, Weinan Zhang

Consequently, the prevalent approach for training machine learning models on data stored in relational databases involves performing feature engineering to merge the data from multiple tables into a single table and subsequently applying single table models.

Feature Engineering Inductive Bias

Paper
Add Code

CCM: Adding Conditional Controls to Text-to-Image Consistency Models

no code implementations • 12 Dec 2023 • Jie Xiao, Kai Zhu, Han Zhang, Zhiheng Liu, Yujun Shen, Yu Liu, Xueyang Fu, Zheng-Jun Zha

Consistency Models (CMs) have showed a promise in creating visual content efficiently and with high quality.

Paper
Add Code

An Incentive Mechanism for Federated Learning Based on Multiple Resource Exchange

no code implementations • 13 Dec 2023 • Ruonan Dong, Hui Xu, Han Zhang, Guopeng Zhang

We formulate the interaction between MO and DOs as an optimization problem, and the objective is to effectively utilize the communication and computing resource of the MO and DOs to minimize the time to complete an FL task.

Federated Learning

Paper
Add Code

VideoLCM: Video Latent Consistency Model

2 code implementations • 14 Dec 2023 • Xiang Wang, Shiwei Zhang, Han Zhang, Yu Liu, Yingya Zhang, Changxin Gao, Nong Sang

Consistency models have demonstrated powerful capability in efficient image generation and allowed synthesis within a few sampling steps, alleviating the high computational cost in diffusion models.

Computational Efficiency Image Generation +1

2,560

Paper
Code

Gemini: A Family of Highly Capable Multimodal Models

no code implementations • The Keyword 2023 • Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, Ryan Doherty, Eli Collins, Clemens Meyer, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Jack Krawczyk, Ed Chi, Heng-Tze Cheng, Eric Ni, Purvi Shah, Patrick Kane, Betty Chan, Manaal Faruqui, Aliaksei Severyn, Hanzhao Lin, Yaguang Li, Yong Cheng, Mahdis Mahdieh, Mia Chen, Pei Sun, Dustin Tran, Sumit Bagri, Balaji Lakshminarayanan, Jeremiah Liu, Andras Orban, Fabian Güra, Hao Zhou, Xinying Song, Aurelien Boffy, Harish Ganapathy, Steven Zheng, HyunJeong Choe, Ágoston Weisz, Tao Zhu, Yifeng Lu, Siddharth Gopal, Jarrod Kahn, Maciej Kula, Jeff Pitman, Rushin Shah, Emanuel Taropa, Majd Al Merey, Martin Baeuml, Zhifeng Chen, Laurent El Shafey, Yujing Zhang, Olcan Sercinoglu, George Tucker, Enrique Piqueras, Maxim Krikun, Iain Barr, Nikolay Savinov, Ivo Danihelka, Becca Roelofs, Anaïs White, Anders Andreassen, Tamara von Glehn, Lakshman Yagati, Mehran Kazemi, Lucas Gonzalez, Misha Khalman, Jakub Sygnowski, Alexandre Frechette, Charlotte Smith, Laura Culp, Lev Proleev, Yi Luan, Xi Chen, James Lottes, Nathan Schucher, Federico Lebron, Alban Rrustemi, Natalie Clay, Phil Crone, Tomas Kocisky, Jeffrey Zhao, Bartek Perz, Dian Yu, Heidi Howard, Adam Bloniarz, Jack W. Rae, Han Lu, Laurent SIfre, Marcello Maggioni, Fred Alcober, Dan Garrette, Megan Barnes, Shantanu Thakoor, Jacob Austin, Gabriel Barth-Maron, William Wong, Rishabh Joshi, Rahma Chaabouni, Deeni Fatiha, Arun Ahuja, Gaurav Singh Tomar, Evan Senter, Martin Chadwick, Ilya Kornakov, Nithya Attaluri, Iñaki Iturrate, Ruibo Liu, Yunxuan Li, Sarah Cogan, Jeremy Chen, Chao Jia, Chenjie Gu, Qiao Zhang, Jordan Grimstad, Ale Jakse Hartman, Xavier Garcia, Thanumalayan Sankaranarayana Pillai, Jacob Devlin, Michael Laskin, Diego de Las Casas, Dasha Valter, Connie Tao, Lorenzo Blanco, Adrià Puigdomènech Badia, David Reitter, Mianna Chen, Jenny Brennan, Clara Rivera, Sergey Brin, Shariq Iqbal, Gabriela Surita, Jane Labanowski, Abhi Rao, Stephanie Winkler, Emilio Parisotto, Yiming Gu, Kate Olszewska, Ravi Addanki, Antoine Miech, Annie Louis, Denis Teplyashin, Geoff Brown, Elliot Catt, Jan Balaguer, Jackie Xiang, Pidong Wang, Zoe Ashwood, Anton Briukhov, Albert Webson, Sanjay Ganapathy, Smit Sanghavi, Ajay Kannan, Ming-Wei Chang, Axel Stjerngren, Josip Djolonga, Yuting Sun, Ankur Bapna, Matthew Aitchison, Pedram Pejman, Henryk Michalewski, Tianhe Yu, Cindy Wang, Juliette Love, Junwhan Ahn, Dawn Bloxwich, Kehang Han, Peter Humphreys, Thibault Sellam, James Bradbury, Varun Godbole, Sina Samangooei, Bogdan Damoc, Alex Kaskasoli, Sébastien M. R. Arnold, Vijay Vasudevan, Shubham Agrawal, Jason Riesa, Dmitry Lepikhin, Richard Tanburn, Srivatsan Srinivasan, Hyeontaek Lim, Sarah Hodkinson, Pranav Shyam, Johan Ferret, Steven Hand, Ankush Garg, Tom Le Paine, Jian Li, Yujia Li, Minh Giang, Alexander Neitz, Zaheer Abbas, Sarah York, Machel Reid, Elizabeth Cole, Aakanksha Chowdhery, Dipanjan Das, Dominika Rogozińska, Vitaliy Nikolaev, Pablo Sprechmann, Zachary Nado, Lukas Zilka, Flavien Prost, Luheng He, Marianne Monteiro, Gaurav Mishra, Chris Welty, Josh Newlan, Dawei Jia, Miltiadis Allamanis, Clara Huiyi Hu, Raoul de Liedekerke, Justin Gilmer, Carl Saroufim, Shruti Rijhwani, Shaobo Hou, Disha Shrivastava, Anirudh Baddepudi, Alex Goldin, Adnan Ozturel, Albin Cassirer, Yunhan Xu, Daniel Sohn, Devendra Sachan, Reinald Kim Amplayo, Craig Swanson, Dessie Petrova, Shashi Narayan, Arthur Guez, Siddhartha Brahma, Jessica Landon, Miteyan Patel, Ruizhe Zhao, Kevin Villela, Luyu Wang, Wenhao Jia, Matthew Rahtz, Mai Giménez, Legg Yeung, James Keeling, Petko Georgiev, Diana Mincu, Boxi Wu, Salem Haykal, Rachel Saputro, Kiran Vodrahalli, James Qin, Zeynep Cankara, Abhanshu Sharma, Nick Fernando, Will Hawkins, Behnam Neyshabur, Solomon Kim, Adrian Hutter, Priyanka Agrawal, Alex Castro-Ros, George van den Driessche, Tao Wang, Shuo-Yiin Chang, Paul Komarek, Ross Mcilroy, Mario Lučić, Guodong Zhang, Wael Farhan, Michael Sharman, Paul Natsev, Paul Michel, Yamini Bansal, Siyuan Qiao, Kris Cao, Siamak Shakeri, Christina Butterfield, Justin Chung, Paul Kishan Rubenstein, Shivani Agrawal, Arthur Mensch, Kedar Soparkar, Karel Lenc, Timothy Chung, Aedan Pope, Loren Maggiore, Jackie Kay, Priya Jhakra, Shibo Wang, Joshua Maynez, Mary Phuong, Taylor Tobin, Andrea Tacchetti, Maja Trebacz, Kevin Robinson, Yash Katariya, Sebastian Riedel, Paige Bailey, Kefan Xiao, Nimesh Ghelani, Lora Aroyo, Ambrose Slone, Neil Houlsby, Xuehan Xiong, Zhen Yang, Elena Gribovskaya, Jonas Adler, Mateo Wirth, Lisa Lee, Music Li, Thais Kagohara, Jay Pavagadhi, Sophie Bridgers, Anna Bortsova, Sanjay Ghemawat, Zafarali Ahmed, Tianqi Liu, Richard Powell, Vijay Bolina, Mariko Iinuma, Polina Zablotskaia, James Besley, Da-Woon Chung, Timothy Dozat, Ramona Comanescu, Xiance Si, Jeremy Greer, Guolong Su, Martin Polacek, Raphaël Lopez Kaufman, Simon Tokumine, Hexiang Hu, Elena Buchatskaya, Yingjie Miao, Mohamed Elhawaty, Aditya Siddhant, Nenad Tomasev, Jinwei Xing, Christina Greer, Helen Miller, Shereen Ashraf, Aurko Roy, Zizhao Zhang, Ada Ma, Angelos Filos, Milos Besta, Rory Blevins, Ted Klimenko, Chih-Kuan Yeh, Soravit Changpinyo, Jiaqi Mu, Oscar Chang, Mantas Pajarskas, Carrie Muir, Vered Cohen, Charline Le Lan, Krishna Haridasan, Amit Marathe, Steven Hansen, Sholto Douglas, Rajkumar Samuel, Mingqiu Wang, Sophia Austin, Chang Lan, Jiepu Jiang, Justin Chiu, Jaime Alonso Lorenzo, Lars Lowe Sjösund, Sébastien Cevey, Zach Gleicher, Thi Avrahami, Anudhyan Boral, Hansa Srinivasan, Vittorio Selo, Rhys May, Konstantinos Aisopos, Léonard Hussenot, Livio Baldini Soares, Kate Baumli, Michael B. Chang, Adrià Recasens, Ben Caine, Alexander Pritzel, Filip Pavetic, Fabio Pardo, Anita Gergely, Justin Frye, Vinay Ramasesh, Dan Horgan, Kartikeya Badola, Nora Kassner, Subhrajit Roy, Ethan Dyer, Víctor Campos Campos, Alex Tomala, Yunhao Tang, Dalia El Badawy, Elspeth White, Basil Mustafa, Oran Lang, Abhishek Jindal, Sharad Vikram, Zhitao Gong, Sergi Caelles, Ross Hemsley, Gregory Thornton, Fangxiaoyu Feng, Wojciech Stokowiec, Ce Zheng, Phoebe Thacker, Çağlar Ünlü, Zhishuai Zhang, Mohammad Saleh, James Svensson, Max Bileschi, Piyush Patil, Ankesh Anand, Roman Ring, Katerina Tsihlas, Arpi Vezer, Marco Selvi, Toby Shevlane, Mikel Rodriguez, Tom Kwiatkowski, Samira Daruki, Keran Rong, Allan Dafoe, Nicholas FitzGerald, Keren Gu-Lemberg, Mina Khan, Lisa Anne Hendricks, Marie Pellat, Vladimir Feinberg, James Cobon-Kerr, Tara Sainath, Maribeth Rauh, Sayed Hadi Hashemi, Richard Ives, Yana Hasson, Eric Noland, Yuan Cao, Nathan Byrd, Le Hou, Qingze Wang, Thibault Sottiaux, Michela Paganini, Jean-Baptiste Lespiau, Alexandre Moufarek, Samer Hassan, Kaushik Shivakumar, Joost van Amersfoort, Amol Mandhane, Pratik Joshi, Anirudh Goyal, Matthew Tung, Andrew Brock, Hannah Sheahan, Vedant Misra, Cheng Li, Nemanja Rakićević, Mostafa Dehghani, Fangyu Liu, Sid Mittal, Junhyuk Oh, Seb Noury, Eren Sezener, Fantine Huot, Matthew Lamm, Nicola De Cao, Charlie Chen, Sidharth Mudgal, Romina Stella, Kevin Brooks, Gautam Vasudevan, Chenxi Liu, Mainak Chain, Nivedita Melinkeri, Aaron Cohen, Venus Wang, Kristie Seymore, Sergey Zubkov, Rahul Goel, Summer Yue, Sai Krishnakumaran, Brian Albert, Nate Hurley, Motoki Sano, Anhad Mohananey, Jonah Joughin, Egor Filonov, Tomasz Kępa, Yomna Eldawy, Jiawern Lim, Rahul Rishi, Shirin Badiezadegan, Taylor Bos, Jerry Chang, Sanil Jain, Sri Gayatri Sundara Padmanabhan, Subha Puttagunta, Kalpesh Krishna, Leslie Baker, Norbert Kalb, Vamsi Bedapudi, Shuntong Lei, Anthony Yu, Oren Litvin, Xiang Zhou, Zhichun Wu, Sam Sobell, Andrea Siciliano, Alan Papir, Robby Neale, Jonas Bragagnolo, Tej Toor, Tina Chen, Valentin Anklin, Feiran Wang, Richie Feng, Milad Gholami, Kevin Ling, Lijuan Liu, Jules Walter, Hamid Moghaddam, Arun Kishore, Jakub Adamek, Tyler Mercado, Jonathan Mallinson, Siddhinita Wandekar, Stephen Cagle, Eran Ofek, Guillermo Garrido, Clemens Lombriser, Maksim Mukha, Botu Sun, Hafeezul Rahman Mohammad, Josip Matak, Yadi Qian, Vikas Peswani, Pawel Janus, Quan Yuan, Leif Schelin, Oana David, Ankur Garg, Yifan He, Oleksii Duzhyi, Anton Älgmyr, Timothée Lottaz, Qi Li, Vikas Yadav, Luyao Xu, Alex Chinien, Rakesh Shivanna, Aleksandr Chuklin, Josie Li, Carrie Spadine, Travis Wolfe, Kareem Mohamed, Subhabrata Das, Zihang Dai, Kyle He, Daniel von Dincklage, Shyam Upadhyay, Akanksha Maurya, Luyan Chi, Sebastian Krause, Khalid Salama, Pam G Rabinovitch, Pavan Kumar Reddy M, Aarush Selvan, Mikhail Dektiarev, Golnaz Ghiasi, Erdem Guven, Himanshu Gupta, Boyi Liu, Deepak Sharma, Idan Heimlich Shtacher, Shachi Paul, Oscar Akerlund, François-Xavier Aubet, Terry Huang, Chen Zhu, Eric Zhu, Elico Teixeira, Matthew Fritze, Francesco Bertolini, Liana-Eleonora Marinescu, Martin Bölle, Dominik Paulus, Khyatti Gupta, Tejasi Latkar, Max Chang, Jason Sanders, Roopa Wilson, Xuewei Wu, Yi-Xuan Tan, Lam Nguyen Thiet, Tulsee Doshi, Sid Lall, Swaroop Mishra, Wanming Chen, Thang Luong, Seth Benjamin, Jasmine Lee, Ewa Andrejczuk, Dominik Rabiej, Vipul Ranjan, Krzysztof Styrc, Pengcheng Yin, Jon Simon, Malcolm Rose Harriott, Mudit Bansal, Alexei Robsky, Geoff Bacon, David Greene, Daniil Mirylenka, Chen Zhou, Obaid Sarvana, Abhimanyu Goyal, Samuel Andermatt, Patrick Siegler, Ben Horn, Assaf Israel, Francesco Pongetti, Chih-Wei "Louis" Chen, Marco Selvatici, Pedro Silva, Kathie Wang, Jackson Tolins, Kelvin Guu, Roey Yogev, Xiaochen Cai, Alessandro Agostini, Maulik Shah, Hung Nguyen, Noah Ó Donnaile, Sébastien Pereira, Linda Friso, Adam Stambler, Adam Kurzrok, Chenkai Kuang, Yan Romanikhin, Mark Geller, ZJ Yan, Kane Jang, Cheng-Chun Lee, Wojciech Fica, Eric Malmi, Qijun Tan, Dan Banica, Daniel Balle, Ryan Pham, Yanping Huang, Diana Avram, Hongzhi Shi, Jasjot Singh, Chris Hidey, Niharika Ahuja, Pranab Saxena, Dan Dooley, Srividya Pranavi Potharaju, Eileen O'Neill, Anand Gokulchandran, Ryan Foley, Kai Zhao, Mike Dusenberry, YuAn Liu, Pulkit Mehta, Ragha Kotikalapudi, Chalence Safranek-Shrader, Andrew Goodman, Joshua Kessinger, Eran Globen, Prateek Kolhar, Chris Gorgolewski, Ali Ibrahim, Yang song, Ali Eichenbaum, Thomas Brovelli, Sahitya Potluri, Preethi Lahoti, Cip Baetu, Ali Ghorbani, Charles Chen, Andy Crawford, Shalini Pal, Mukund Sridhar, Petru Gurita, Asier Mujika, Igor Petrovski, Pierre-Louis Cedoz, Chenmei Li, Shiyuan Chen, Niccolò Dal Santo, Siddharth Goyal, Jitesh Punjabi, Karthik Kappaganthu, Chester Kwak, Pallavi LV, Sarmishta Velury, Himadri Choudhury, Jamie Hall, Premal Shah, Ricardo Figueira, Matt Thomas, Minjie Lu, Ting Zhou, Chintu Kumar, Thomas Jurdi, Sharat Chikkerur, Yenai Ma, Adams Yu, Soo Kwak, Victor Ähdel, Sujeevan Rajayogam, Travis Choma, Fei Liu, Aditya Barua, Colin Ji, Ji Ho Park, Vincent Hellendoorn, Alex Bailey, Taylan Bilal, Huanjie Zhou, Mehrdad Khatir, Charles Sutton, Wojciech Rzadkowski, Fiona Macintosh, Konstantin Shagin, Paul Medina, Jinjing Zhou, Pararth Shah, Yingying Bi, Attila Dankovics, Shipra Banga, Sabine Lehmann, Marissa Bredesen, Zifan Lin, John Eric Hoffmann, Jonathan Lai, Raynald Chung, Kai Yang, Nihal Balani, Arthur Bražinskas, Andrei Sozanschi, Matthew Hayes, Héctor Fernández Alcalde, Peter Makarov, Will Chen, Antonio Stella, Liselotte Snijders, Michael Mandl, Ante Kärrman, Paweł Nowak, Xinyi Wu, Alex Dyck, Krishnan Vaidyanathan, Raghavender R, Jessica Mallet, Mitch Rudominer, Eric Johnston, Sushil Mittal, Akhil Udathu, Janara Christensen, Vishal Verma, Zach Irving, Andreas Santucci, Gamaleldin Elsayed, Elnaz Davoodi, Marin Georgiev, Ian Tenney, Geoffrey Cideron, Edouard Leurent, Mahmoud Alnahlawi, Ionut Georgescu, Nan Wei, Ivy Zheng, Dylan Scandinaro, Heinrich Jiang, Jasper Snoek, Mukund Sundararajan, Xuezhi Wang, Zack Ontiveros, Itay Karo, Jeremy Cole, Vinu Rajashekhar, Lara Tumeh, Eyal Ben-David, Rishub Jain, Jonathan Uesato, Romina Datta, Oskar Bunyan, Shimu Wu, John Zhang, Piotr Stanczyk, Ye Zhang, David Steiner, Subhajit Naskar, Michael Azzam, Matthew Johnson, Adam Paszke, Chung-Cheng Chiu, Jaume Sanchez Elias, Afroz Mohiuddin, Faizan Muhammad, Jin Miao, Andrew Lee, Nino Vieillard, Jane Park, Jiageng Zhang, Jeff Stanway, Drew Garmon, Abhijit Karmarkar, Zhe Dong, Jong Lee, Aviral Kumar, Luowei Zhou, Jonathan Evens, William Isaac, Geoffrey Irving, Edward Loper, Michael Fink, Isha Arkatkar, Nanxin Chen, Izhak Shafran, Ivan Petrychenko, Zhe Chen, Johnson Jia, Anselm Levskaya, Zhenkai Zhu, Peter Grabowski, Yu Mao, Alberto Magni, Kaisheng Yao, Javier Snaider, Norman Casagrande, Evan Palmer, Paul Suganthan, Alfonso Castaño, Irene Giannoumis, Wooyeol Kim, Mikołaj Rybiński, Ashwin Sreevatsa, Jennifer Prendki, David Soergel, Adrian Goedeckemeyer, Willi Gierke, Mohsen Jafari, Meenu Gaba, Jeremy Wiesner, Diana Gage Wright, Yawen Wei, Harsha Vashisht, Yana Kulizhskaya, Jay Hoover, Maigo Le, Lu Li, Chimezie Iwuanyanwu, Lu Liu, Kevin Ramirez, Andrey Khorlin, Albert Cui, Tian Lin, Marcus Wu, Ricardo Aguilar, Keith Pallo, Abhishek Chakladar, Ginger Perng, Elena Allica Abellan, Mingyang Zhang, Ishita Dasgupta, Nate Kushman, Ivo Penchev, Alena Repina, Xihui Wu, Tom van der Weide, Priya Ponnapalli, Caroline Kaplan, Jiri Simsa, Shuangfeng Li, Olivier Dousse, Jeff Piper, Nathan Ie, Rama Pasumarthi, Nathan Lintz, Anitha Vijayakumar, Daniel Andor, Pedro Valenzuela, Minnie Lui, Cosmin Paduraru, Daiyi Peng, Katherine Lee, Shuyuan Zhang, Somer Greene, Duc Dung Nguyen, Paula Kurylowicz, Cassidy Hardin, Lucas Dixon, Lili Janzer, Kiam Choo, Ziqiang Feng, Biao Zhang, Achintya Singhal, Dayou Du, Dan McKinnon, Natasha Antropova, Tolga Bolukbasi, Orgad Keller, David Reid, Daniel Finchelstein, Maria Abi Raad, Remi Crocker, Peter Hawkins, Robert Dadashi, Colin Gaffney, Ken Franko, Anna Bulanova, Rémi Leblond, Shirley Chung, Harry Askham, Luis C. Cobo, Kelvin Xu, Felix Fischer, Jun Xu, Christina Sorokin, Chris Alberti, Chu-Cheng Lin, Colin Evans, Alek Dimitriev, Hannah Forbes, Dylan Banarse, Zora Tung, Mark Omernick, Colton Bishop, Rachel Sterneck, Rohan Jain, Jiawei Xia, Ehsan Amid, Francesco Piccinno, Xingyu Wang, Praseem Banzal, Daniel J. Mankowitz, Alex Polozov, Victoria Krakovna, Sasha Brown, Mohammadhossein Bateni, Dennis Duan, Vlad Firoiu, Meghana Thotakuri, Tom Natan, Matthieu Geist, Ser tan Girgin, Hui Li, Jiayu Ye, Ofir Roval, Reiko Tojo, Michael Kwong, James Lee-Thorp, Christopher Yew, Danila Sinopalnikov, Sabela Ramos, John Mellor, Abhishek Sharma, Kathy Wu, David Miller, Nicolas Sonnerat, Denis Vnukov, Rory Greig, Jennifer Beattie, Emily Caveness, Libin Bai, Julian Eisenschlos, Alex Korchemniy, Tomy Tsai, Mimi Jasarevic, Weize Kong, Phuong Dao, Zeyu Zheng, Frederick Liu, Fan Yang, Rui Zhu, Tian Huey Teh, Jason Sanmiya, Evgeny Gladchenko, Nejc Trdin, Daniel Toyama, Evan Rosen, Sasan Tavakkol, Linting Xue, Chen Elkind, Oliver Woodman, John Carpenter, George Papamakarios, Rupert Kemp, Sushant Kafle, Tanya Grunina, Rishika Sinha, Alice Talbert, Diane Wu, Denese Owusu-Afriyie, Cosmo Du, Chloe Thornton, Jordi Pont-Tuset, Pradyumna Narayana, Jing Li, Saaber Fatehi, John Wieting, Omar Ajmeri, Benigno Uria, Yeongil Ko, Laura Knight, Amélie Héliou, Ning Niu, Shane Gu, Chenxi Pang, Yeqing Li, Nir Levine, Ariel Stolovich, Rebeca Santamaria-Fernandez, Sonam Goenka, Wenny Yustalim, Robin Strudel, Ali Elqursh, Charlie Deck, Hyo Lee, Zonglin Li, Kyle Levin, Raphael Hoffmann, Dan Holtmann-Rice, Olivier Bachem, Sho Arora, Christy Koh, Soheil Hassas Yeganeh, Siim Põder, Mukarram Tariq, Yanhua Sun, Lucian Ionita, Mojtaba Seyedhosseini, Pouya Tafti, Zhiyu Liu, Anmol Gulati, Jasmine Liu, Xinyu Ye, Bart Chrzaszcz, Lily Wang, Nikhil Sethi, Tianrun Li, Ben Brown, Shreya Singh, Wei Fan, Aaron Parisi, Joe Stanton, Vinod Koverkathu, Christopher A. Choquette-Choo, Yunjie Li, TJ Lu, Abe Ittycheriah, Prakash Shroff, Mani Varadarajan, Sanaz Bahargam, Rob Willoughby, David Gaddy, Guillaume Desjardins, Marco Cornero, Brona Robenek, Bhavishya Mittal, Ben Albrecht, Ashish Shenoy, Fedor Moiseev, Henrik Jacobsson, Alireza Ghaffarkhah, Morgane Rivière, Alanna Walton, Clément Crepy, Alicia Parrish, Zongwei Zhou, Clement Farabet, Carey Radebaugh, Praveen Srinivasan, Claudia van der Salm, Andreas Fidjeland, Salvatore Scellato, Eri Latorre-Chimoto, Hanna Klimczak-Plucińska, David Bridson, Dario de Cesare, Tom Hudson, Piermaria Mendolicchio, Lexi Walker, Alex Morris, Matthew Mauger, Alexey Guseynov, Alison Reid, Seth Odoom, Lucia Loher, Victor Cotruta, Madhavi Yenugula, Dominik Grewe, Anastasia Petrushkina, Tom Duerig, Antonio Sanchez, Steve Yadlowsky, Amy Shen, Amir Globerson, Lynette Webb, Sahil Dua, Dong Li, Surya Bhupatiraju, Dan Hurt, Haroon Qureshi, Ananth Agarwal, Tomer Shani, Matan Eyal, Anuj Khare, Shreyas Rammohan Belle, Lei Wang, Chetan Tekur, Mihir Sanjay Kale, Jinliang Wei, Ruoxin Sang, Brennan Saeta, Tyler Liechty, Yao Zhao, Stephan Lee, Pandu Nayak, Doug Fritz, Manish Reddy Vuyyuru, John Aslanides, Nidhi Vyas, Martin Wicke, Xiao Ma, Evgenii Eltyshev, Nina Martin, Hardie Cate, James Manyika, Keyvan Amiri, Yelin Kim, Xi Xiong, Kai Kang, Florian Luisier, Nilesh Tripuraneni, David Madras, Mandy Guo, Austin Waters, Oliver Wang, Joshua Ainslie, Jason Baldridge, Han Zhang, Garima Pruthi, Jakob Bauer, Feng Yang, Riham Mansour, Jason Gelman, Yang Xu, George Polovets, Ji Liu, Honglong Cai, Warren Chen, XiangHai Sheng, Emily Xue, Sherjil Ozair, Christof Angermueller, Xiaowei Li, Anoop Sinha, Weiren Wang, Julia Wiesinger, Emmanouil Koukoumidis, Yuan Tian, Anand Iyer, Madhu Gurumurthy, Mark Goldenson, Parashar Shah, MK Blake, Hongkun Yu, Anthony Urbanowicz, Jennimaria Palomaki, Chrisantha Fernando, Ken Durden, Harsh Mehta, Nikola Momchev, Elahe Rahimtoroghi, Maria Georgaki, Amit Raul, Sebastian Ruder, Morgan Redshaw, Jinhyuk Lee, Denny Zhou, Komal Jalan, Dinghua Li, Blake Hechtman, Parker Schuh, Milad Nasr, Kieran Milan, Vladimir Mikulik, Juliana Franco, Tim Green, Nam Nguyen, Joe Kelley, Aroma Mahendru, Andrea Hu, Joshua Howland, Ben Vargas, Jeffrey Hui, Kshitij Bansal, Vikram Rao, Rakesh Ghiya, Emma Wang, Ke Ye, Jean Michel Sarr, Melanie Moranski Preston, Madeleine Elish, Steve Li, Aakash Kaku, Jigar Gupta, Ice Pasupat, Da-Cheng Juan, Milan Someswar, Tejvi M., Xinyun Chen, Aida Amini, Alex Fabrikant, Eric Chu, Xuanyi Dong, Amruta Muthal, Senaka Buthpitiya, Sarthak Jauhari, Nan Hua, Urvashi Khandelwal, Ayal Hitron, Jie Ren, Larissa Rinaldi, Shahar Drath, Avigail Dabush, Nan-Jiang Jiang, Harshal Godhia, Uli Sachs, Anthony Chen, Yicheng Fan, Hagai Taitelbaum, Hila Noga, Zhuyun Dai, James Wang, Chen Liang, Jenny Hamer, Chun-Sung Ferng, Chenel Elkind, Aviel Atias, Paulina Lee, Vít Listík, Mathias Carlen, Jan van de Kerkhof, Marcin Pikus, Krunoslav Zaher, Paul Müller, Sasha Zykova, Richard Stefanec, Vitaly Gatsko, Christoph Hirnschall, Ashwin Sethi, Xingyu Federico Xu, Chetan Ahuja, Beth Tsai, Anca Stefanoiu, Bo Feng, Keshav Dhandhania, Manish Katyal, Akshay Gupta, Atharva Parulekar, Divya Pitta, Jing Zhao, Vivaan Bhatia, Yashodha Bhavnani, Omar Alhadlaq, Xiaolin Li, Peter Danenberg, Dennis Tu, Alex Pine, Vera Filippova, Abhipso Ghosh, Ben Limonchik, Bhargava Urala, Chaitanya Krishna Lanka, Derik Clive, Yi Sun, Edward Li, Hao Wu, Kevin Hongtongsak, Ianna Li, Kalind Thakkar, Kuanysh Omarov, Kushal Majmundar, Michael Alverson, Michael Kucharski, Mohak Patel, Mudit Jain, Maksim Zabelin, Paolo Pelagatti, Rohan Kohli, Saurabh Kumar, Joseph Kim, Swetha Sankar, Vineet Shah, Lakshmi Ramachandruni, Xiangkai Zeng, Ben Bariach, Laura Weidinger, Amar Subramanya, Sissie Hsiao, Demis Hassabis, Koray Kavukcuoglu, Adam Sadovsky, Quoc Le, Trevor Strohman, Yonghui Wu, Slav Petrov, Jeffrey Dean, Oriol Vinyals

This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding.

Ranked #1 on Multi-task Language Understanding on MMLU (using extra training data)

Arithmetic Reasoning Code Generation +3

Paper
Add Code

Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles

no code implementations • 30 Dec 2023 • Yuanzhao Zhai, Han Zhang, Yu Lei, Yue Yu, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang

Reinforcement learning from human feedback (RLHF) emerges as a promising paradigm for aligning large language models (LLMs).

Uncertainty Quantification

Paper
Add Code

Fundamental Limitation of Semantic Communications: Neural Estimation for Rate-Distortion

no code implementations • 2 Jan 2024 • Dongxu Li, Jianhao Huang, Chuan Huang, Xiaoqi Qin, Han Zhang, Ping Zhang

For the case with unknown semantic source distribution, while only a set of the source samples is available, we propose a neural-network-based method by leveraging the generative networks to learn the semantic source distribution.

Paper
Add Code

Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation

no code implementations • 11 Jan 2024 • Seung Hyun Lee, Yinxiao Li, Junjie Ke, Innfarn Yoo, Han Zhang, Jiahui Yu, Qifei Wang, Fei Deng, Glenn Entis, Junfeng He, Gang Li, Sangpil Kim, Irfan Essa, Feng Yang

Additionally, Parrot employs a joint optimization approach for the T2I model and the prompt expansion network, facilitating the generation of quality-aware text prompts, thus further enhancing the final image quality.

Reinforcement Learning (RL) Text-to-Image Generation

Paper
Add Code

CPPO: Continual Learning for Reinforcement Learning with Human Feedback

no code implementations • Conference 2024 • Han Zhang, Yu Lei, Lin Gui, Min Yang, Yulan He, Hui Wang, Ruifeng Xu

The approach of Reinforcement Learning from Human Feedback (RLHF) is widely used for enhancing pre-trained Language Models (LM), enabling them to better align with human preferences.

Continual Learning reinforcement-learning

Paper
Add Code

Federated Learning with Dual Attention for Robust Modulation Classification under Attacks

no code implementations • 19 Jan 2024 • Han Zhang, Medhat Elsayed, Majid Bavand, Raimundas Gaigalas, Yigit Ozcan, Melike Erol-Kantarci

To this end, we leverage attention mechanisms as a defense against attacks in FL and propose a robust FL algorithm by integrating the attention mechanisms into the global model aggregation step.

Data Poisoning Federated Learning

Paper
Add Code

Cas-DiffCom: Cascaded diffusion model for infant longitudinal super-resolution 3D medical image completion

no code implementations • 21 Feb 2024 • Lianghu Guo, Tianli Tao, Xinyi Cai, Zihao Zhu, Jiawei Huang, Lixuan Zhu, Zhuoyang Gu, Haifeng Tang, Rui Zhou, Siyan Han, Yan Liang, Qing Yang, Dinggang Shen, Han Zhang

Early infancy is a rapid and dynamic neurodevelopmental period for behavior and neurocognition.

Super-Resolution

Paper
Add Code

QIS : Interactive Segmentation via Quasi-Conformal Mappings

no code implementations • 22 Feb 2024 • Han Zhang, Daoping Zhang, Lok Ming Lui

In this paper, we propose the quasi-conformal interactive segmentation (QIS) model, which incorporates user input in the form of positive and negative clicks.

Image Segmentation Interactive Segmentation +2

Paper
Add Code

COPR: Continual Human Preference Learning via Optimal Policy Regularization

no code implementations • 22 Feb 2024 • Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu

Reinforcement Learning from Human Feedback (RLHF) is commonly utilized to improve the alignment of Large Language Models (LLMs) with human preferences.

Continual Learning

Paper
Add Code

MUC: Mixture of Uncalibrated Cameras for Robust 3D Human Body Reconstruction

no code implementations • 8 Mar 2024 • Yitao Zhu, Sheng Wang, Mengjie Xu, Zixu Zhuang, Zhixin Wang, Kaidong Wang, Han Zhang, Qian Wang

Next, instead of simply averaging models across views, we train a network to determine the weights of individual views for their fusion, based on the parameters estimated for joints and hands of human body as well as camera positions.

Paper
Add Code

FluoroSAM: A Language-aligned Foundation Model for X-ray Image Segmentation

1 code implementation • 12 Mar 2024 • Benjamin D. Killeen, Liam J. Wang, Han Zhang, Mehran Armand, Russell H. Taylor, Dave Dreizin, Greg Osgood, Mathias Unberath

Recently, foundation models (FMs) -- machine learning models trained on large amounts of highly variable data thus enabling broad applicability -- have emerged as promising tools for automated image analysis.

Image Segmentation Semantic Segmentation +1

Paper
Code

QKFormer: Hierarchical Spiking Transformer using Q-K Attention

2 code implementations • 25 Mar 2024 • Chenlin Zhou, Han Zhang, Zhaokun Zhou, Liutao Yu, Liwei Huang, Xiaopeng Fan, Li Yuan, Zhengyu Ma, Huihui Zhou, Yonghong Tian

ii) We incorporate the hierarchical structure, which significantly benefits the performance of both the brain and artificial neural networks, into spiking transformers to obtain multi-scale spiking representation.

Paper
Code

Revolutionizing Disease Diagnosis with simultaneous functional PET/MR and Deeply Integrated Brain Metabolic, Hemodynamic, and Perfusion Networks

no code implementations • 29 Mar 2024 • Luoyu Wang, Yitian Tao, Qing Yang, Yan Liang, Siwei Liu, Hongcheng Shi, Dinggang Shen, Han Zhang

To fully exploit the inherent complex and nonlinear relation among modalities while producing fine-grained representations for uni-modal inference, we subsequently add a modal alignment module to line up a dominant modality (e. g., PET) with representations of auxiliary modalities (MR).

Paper
Add Code

Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation

no code implementations • 31 Mar 2024 • Yitian Tao, Liyan Ma, Jing Yu, Han Zhang

To ensure the semantic consistency of the retrieved cross modal prior knowledge, a cross-modal semantic alignment module (SAM) is proposed.

Paper
Add Code

A Multi-Level Framework for Accelerating Training Transformer Models

1 code implementation • 7 Apr 2024 • Longwei Zou, Han Zhang, Yangdong Deng

Specifically, the framework is based on three basic operators, Coalescing, De-coalescing and Interpolation, which can be orchestrated to build a multi-level training framework.

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.