Search Results for author: Han Zhang

Found 149 papers, 64 papers with code

Multi-lingual Geoparsing based on Machine Translation

no code implementations6 Nov 2015 Xu Chen, Han Zhang, Judith Gelernter

Our results for geoparsing Chinese and Arabic text using our multi-lingual geoparsing method are comparable to our results for geoparsing English text with our English tools.

Machine Translation Translation

SegAN: Adversarial Network with Multi-scale $L_1$ Loss for Medical Image Segmentation

2 code implementations6 Jun 2017 Yuan Xue, Tao Xu, Han Zhang, Rodney Long, Xiaolei Huang

Extensive experimental results demonstrate the effectiveness of the proposed SegAN with multi-scale loss: on BRATS 2013 SegAN gives performance comparable to the state-of-the-art for whole tumor and tumor core segmentation while achieves better precision and sensitivity for Gd-enhance tumor core segmentation; on BRATS 2015 SegAN achieves better performance than the state-of-the-art in both dice score and precision.

Brain Tumor Segmentation Image Segmentation +2

Link the head to the "beak": Zero Shot Learning from Noisy Text Description at Part Precision

no code implementations CVPR 2017 Mohamed Elhoseiny, Yizhe Zhu, Han Zhang, Ahmed Elgammal

We propose a learning framework that is able to connect text terms to its relevant parts and suppress connections to non-visual text terms without any part-text annotations.

Zero-Shot Learning

AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

19 code implementations CVPR 2018 Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He

In this paper, we propose an Attentional Generative Adversarial Network (AttnGAN) that allows attention-driven, multi-stage refinement for fine-grained text-to-image generation.

Generative Adversarial Network Image-text matching +2

Improving GANs Using Optimal Transport

2 code implementations ICLR 2018 Tim Salimans, Han Zhang, Alec Radford, Dimitris Metaxas

We present Optimal Transport GAN (OT-GAN), a variant of generative adversarial nets minimizing a new metric measuring the distance between the generator distribution and the data distribution.

Image Generation

Self-Attention Generative Adversarial Networks

49 code implementations arXiv 2018 Han Zhang, Ian Goodfellow, Dimitris Metaxas, Augustus Odena

In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks.

Conditional Image Generation Generative Adversarial Network

Deep Chronnectome Learning via Full Bidirectional Long Short-Term Memory Networks for MCI Diagnosis

no code implementations30 Aug 2018 Weizheng Yan, Han Zhang, Jing Sui, Dinggang Shen

Dynamic functional connectivity (dFC), consisting of time-varying spatiotemporal dynamics, may characterize "chronnectome" diagnostic information for improving MCI classification.

Clustering General Classification +1

A Unified Mammogram Analysis Method via Hybrid Deep Supervision

1 code implementation31 Aug 2018 Rongzhao Zhang, Han Zhang, Albert C. S. Chung

In this work, we present a unified mammogram analysis framework for both whole-mammogram classification and segmentation.

Classification General Classification +3

A Teacher-Student Framework for Maintainable Dialog Manager

no code implementations EMNLP 2018 Weikang Wang, Jiajun Zhang, Han Zhang, Mei-Yuh Hwang, Cheng-qing Zong, Zhifei Li

Specifically, the {``}student{''} is an extended dialog manager based on a new ontology, and the {``}teacher{''} is existing resources used for guiding the learning process of the {``}student{''}.

Reinforcement Learning (RL)

Multi-Antenna Channel Interpolation via Tucker Decomposed Extreme Learning Machine

no code implementations26 Dec 2018 Han Zhang, Bo Ai, Wenjun Xu, Li Xu, Shuguang Cui

Channel interpolation is an essential technique for providing high-accuracy estimation of the channel state information (CSI) for wireless systems design where the frequency-space structural correlations of multi-antenna channel are typically hidden in matrix or tensor forms.

Deep Learning for Signal Demodulation in Physical Layer Wireless Communications: Prototype Platform, Open Dataset, and Analytics

no code implementations8 Mar 2019 Hongmei Wang, Zhenzhen Wu, Shuai Ma, Songtao Lu, Han Zhang, Guoru Ding, Shiyin Li

In this paper, we investigate deep learning (DL)-enabled signal demodulation methods and establish the first open dataset of real modulated signals for wireless communication systems.

Brain Network Construction and Classification Toolbox (BrainNetClass)

1 code implementation17 Jun 2019 Zhen Zhou, Xiaobo Chen, Yu Zhang, Lishan Qiao, Renping Yu, Gang Pan, Han Zhang, Dinggang Shen

The goal of this work is to introduce a toolbox namely "Brain Network Construction and Classification" (BrainNetClass) to the field to promote more advanced brain network construction methods.

Classification General Classification

DynWalks: Global Topology and Recent Changes Awareness Dynamic Network Embedding

2 code implementations arXiv 2019 Chengbin Hou, Han Zhang, Ke Tang, Shan He

Dynamic network embedding aims to learn low dimensional embeddings for unseen and seen nodes by using any currently available snapshots of a dynamic network.

Graph Reconstruction Link Prediction +1

Approximation Capabilities of Neural ODEs and Invertible Residual Networks

no code implementations ICML 2020 Han Zhang, Xi Gao, Jacob Unterman, Tom Arodz

Neural ODEs and i-ResNet are recently proposed methods for enforcing invertibility of residual neural models.

Multiple instance dense connected convolution neural network for aerial image scene classification

no code implementations22 Aug 2019 Qi Bi, Kun Qin, Zhili Li, Han Zhang, Kai Xu

While the current convolution neural network tends to extract global features and global semantic information in a scene, the geo-spatial objects can be located at anywhere in an aerial image scene and their spatial arrangement tends to be more complicated.

General Classification Scene Classification

Building change detection based on multi-scale filtering and grid partition

no code implementations22 Aug 2019 Qi Bi, Kun Qin, Han Zhang, Wenjun Han, Zhili Li, Kai Xu

Exhaustive experiments indicate that the proposed method can detect building change types directly and outperform the current multi-index learning method.

Change Detection

Distributed Equivalent Substitution Training for Large-Scale Recommender Systems

no code implementations10 Sep 2019 Haidong Rong, Yangzihao Wang, Feihu Zhou, Junjie Zhai, Haiyang Wu, Rui Lan, Fan Li, Han Zhang, Yuekui Yang, Zhenyu Guo, Di Wang

We present Distributed Equivalent Substitution (DES) training, a novel distributed training framework for large-scale recommender systems with dynamic sparse features.

Recommendation Systems

Distilling Effective Supervision from Severe Label Noise

2 code implementations CVPR 2020 Zizhao Zhang, Han Zhang, Sercan O. Arik, Honglak Lee, Tomas Pfister

For instance, on CIFAR100 with a $40\%$ uniform noise ratio and only 10 trusted labeled data per class, our method achieves $80. 2{\pm}0. 3\%$ classification accuracy, where the error rate is only $1. 4\%$ higher than a neural network trained without label noise.

Image Classification

Differentiable Combinatorial Losses through Generalized Gradients of Linear Programs

no code implementations18 Oct 2019 Xi Gao, Han Zhang, Aliakbar Panahi, Tom Arodz

When samples have internal structure, we often see a mismatch between the objective optimized during training and the model's goal during inference.

Combinatorial Optimization Graph Matching +1

Small-GAN: Speeding Up GAN Training Using Core-sets

no code implementations ICML 2020 Samarth Sinha, Han Zhang, Anirudh Goyal, Yoshua Bengio, Hugo Larochelle, Augustus Odena

Recent work by Brock et al. (2018) suggests that Generative Adversarial Networks (GANs) benefit disproportionately from large mini-batch sizes.

Active Learning Anomaly Detection +1

Multimodal, Multilingual Grapheme-to-Phoneme Conversion for Low-Resource Languages

no code implementations WS 2019 James Route, Steven Hillis, Isak Czeresnia Etinger, Han Zhang, Alan W. black

Grapheme-to-phoneme conversion (g2p) is the task of predicting the pronunciation of words from their orthographic representation.

MANELA: A Multi-Agent Algorithm for Learning Network Embeddings

no code implementations1 Dec 2019 Han Zhang, Hong Xu

On the other hand, learning network embeddings on distributively stored networks still remained understudied: To the best of our knowledge, all existing algorithms for learning network embeddings have hitherto been exclusively centralized and thus cannot be applied to these networks.

BIG-bench Machine Learning Network Embedding

ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

4 code implementations26 Jan 2020 Dongling Xiao, Han Zhang, Yukun Li, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks.

 Ranked #1 on Question Generation on SQuAD1.1 (using extra training data)

Abstractive Text Summarization Dialogue Generation +3

Improved Consistency Regularization for GANs

no code implementations11 Feb 2020 Zhengli Zhao, Sameer Singh, Honglak Lee, Zizhao Zhang, Augustus Odena, Han Zhang

Recent work has increased the performance of Generative Adversarial Networks (GANs) by enforcing a consistency cost on the discriminator.

Image Generation

Solving Missing-Annotation Object Detection with Background Recalibration Loss

2 code implementations12 Feb 2020 Han Zhang, Fangyi Chen, Zhiqiang Shen, Qiqi Hao, Chenchen Zhu, Marios Savvides

In this paper, we introduce a superior solution called Background Recalibration Loss (BRL) that can automatically re-calibrate the loss signals according to the pre-defined IoU threshold and input image.

Object object-detection +1

ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring

1 code implementation ICLR 2020 David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, Colin Raffel

We improve the recently-proposed ``MixMatch semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring.

A Simple Semi-Supervised Learning Framework for Object Detection

7 code implementations10 May 2020 Kihyuk Sohn, Zizhao Zhang, Chun-Liang Li, Han Zhang, Chen-Yu Lee, Tomas Pfister

Semi-supervised learning (SSL) has a potential to improve the predictive performance of machine learning models using unlabeled data.

Ranked #13 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)

Data Augmentation Image Classification +4

Towards Personalized and Semantic Retrieval: An End-to-End Solution for E-commerce Search via Embedding Learning

no code implementations3 Jun 2020 Han Zhang, Songlin Wang, Kang Zhang, Zhiling Tang, Yunjiang Jiang, Yun Xiao, Weipeng Yan, Wen-Yun Yang

Two critical challenges stay in today's e-commerce search: how to retrieve items that are semantically relevant but not exact matching to query terms, and how to retrieve items that are more personalized to different users for the same search query.

Retrieval Semantic Retrieval

Image Augmentations for GAN Training

no code implementations4 Jun 2020 Zhengli Zhao, Zizhao Zhang, Ting Chen, Sameer Singh, Han Zhang

We provide new state-of-the-art results for conditional generation on CIFAR-10 with both consistency loss and contrastive loss as additional regularizations.

Image Augmentation Image Generation

A Hybrid Evolutionary Algorithm for Reliable Facility Location Problem

no code implementations27 Jun 2020 Han Zhang, Jialin Liu, Xin Yao

The reliable facility location problem (RFLP) is an important research topic of operational research and plays a vital role in the decision-making and management of modern supply chain and logistics.

Decision Making Management

From Spectrum Wavelet to Vertex Propagation: Graph Convolutional Networks Based on Taylor Approximation

no code implementations1 Jul 2020 Songyang Zhang, Han Zhang, Shuguang Cui, Zhi Ding

Graph convolutional networks (GCN) have been recently utilized to extract the underlying structures of datasets with some labeled data and high-dimensional features.

Node Classification

Improving NER's Performance with Massive financial corpus

1 code implementation31 Jul 2020 Han Zhang

Training large deep neural networks needs massive high quality annotation data, but the time and labor costs are too expensive for small business.

Language Modelling

GloDyNE: Global Topology Preserving Dynamic Network Embedding

2 code implementations5 Aug 2020 Chengbin Hou, Han Zhang, Shan He, Ke Tang

The main and common objective of Dynamic Network Embedding (DNE) is to efficiently update node embeddings while preserving network topology at each time step.

Graph Reconstruction Incremental Learning +1

Co-evolution of Functional Brain Network at Multiple Scales during Early Infancy

no code implementations15 Sep 2020 Xuyun Wen, Liming Hsu, Weili Lin, Han Zhang, Dinggang Shen

By applying our proposed methodological framework on the collected longitudinal infant dataset, we provided the first evidence that, in the first 2 years of life, the brain functional network is co-evolved at different scales, where each scale displays the unique reconfiguration pattern in terms of modular organization.

Transfer learning of chaotic systems

no code implementations15 Nov 2020 Yali Guo, Han Zhang, Liang Wang, Huawei Fan, Xingang Wang

Here we investigate transfer learning of chaotic systems from the perspective of synchronization-based state inference, in which a reservoir computer trained by chaotic system A is used to infer the unmeasured variables of chaotic system B, while A is different from B in either parameter or dynamics.

General Knowledge Time Series +2

Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework

1 code implementation15 Dec 2020 Han Zhang, Wenhao Zheng, Charley Chen, Kevin Gao, Yao Hu, Ling Huang, Wei Xu

Meanwhile, such applications usually require modeling the intrinsic clusters in high-dimensional data, which usually displays heterogeneous statistical patterns as the patterns of different clusters may appear in different dimensions.

Anomaly Detection Fraud Detection

A Multiscale Graph Convolutional Network for Change Detection in Homogeneous and Heterogeneous Remote Sensing Images

no code implementations16 Feb 2021 Junzheng Wu, Biao Li, Yao Qin, Weiping Ni, Han Zhang, Yuli Sun

In this paper, a novel CD method based on the graph convolutional network (GCN) and multiscale object-based technique is proposed for both homogeneous and heterogeneous images.

Change Detection

Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction

1 code implementation ICLR 2021 Wonkwang Lee, Whie Jung, Han Zhang, Ting Chen, Jing Yu Koh, Thomas Huang, Hyungsuk Yoon, Honglak Lee, Seunghoon Hong

Despite the recent advances in the literature, existing approaches are limited to moderately short-term prediction (less than a few seconds), while extrapolating it to a longer future quickly leads to destruction in structure and content.

Translation Video Prediction

Transfer training from smaller language model

no code implementations23 Apr 2021 Han Zhang

We initialize a larger target model from a smaller source model by copy weight values from source model and padding with zeros or small initialization values on it to make the source and target model have approximate outputs, which is valid due to block matrix multiplication and residual connection in transformer structure.

Language Modelling Large Language Model +1

Learning Hamiltonian dynamics by reservoir computer

no code implementations24 Apr 2021 Han Zhang, Huawei Fan, Liang Wang, Xingang Wang

Reconstructing the KAM dynamics diagram of Hamiltonian system from the time series of a limited number of parameters is an outstanding question in nonlinear science, especially when the Hamiltonian governing the system dynamics are unknown.

Time Series Time Series Analysis

Joint Learning of Deep Retrieval Model and Product Quantization based Embedding Index

1 code implementation9 May 2021 Han Zhang, Hongwei Shen, Yiming Qiu, Yunjiang Jiang, Songlin Wang, Sulong Xu, Yun Xiao, Bo Long, Wen-Yun Yang

Embedding index that enables fast approximate nearest neighbor(ANN) search, serves as an indispensable component for state-of-the-art deep retrieval systems.

Quantization Retrieval

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

6 code implementations26 May 2021 Zizhao Zhang, Han Zhang, Long Zhao, Ting Chen, Sercan O. Arik, Tomas Pfister

Hierarchical structures are popular in recent vision transformers, however, they require sophisticated designs and massive datasets to work well.

Image Classification Image Generation

Improved Transformer for High-Resolution GANs

1 code implementation NeurIPS 2021 Long Zhao, Zizhao Zhang, Ting Chen, Dimitris N. Metaxas, Han Zhang

Attention-based models, exemplified by the Transformer, can effectively model long range dependency, but suffer from the quadratic complexity of self-attention operation, making them difficult to be adopted for high-resolution image generation based on Generative Adversarial Networks (GANs).

Ranked #2 on Image Generation on CelebA 256x256 (FID metric)

Image Generation Vocal Bursts Intensity Prediction

SearchGCN: Powering Embedding Retrieval by Graph Convolution Networks for E-Commerce Search

no code implementations1 Jul 2021 Xinlin Xia, Shang Wang, Han Zhang, Songlin Wang, Sulong Xu, Yun Xiao, Bo Long, Wen-Yun Yang

Graph convolution networks (GCN), which recently becomes new state-of-the-art method for graph node classification, recommendation and other applications, has not been successfully applied to industrial-scale search engine yet.

Node Classification Retrieval

ViTGAN: Training GANs with Vision Transformers

3 code implementations ICLR 2022 Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Zhuowen Tu, Ce Liu

Recently, Vision Transformers (ViTs) have shown competitive performance on image recognition while requiring less vision-specific inductive biases.

Image Generation

Deep Image Synthesis from Intuitive User Input: A Review and Perspectives

no code implementations9 Jul 2021 Yuan Xue, Yuan-Chen Guo, Han Zhang, Tao Xu, Song-Hai Zhang, Xiaolei Huang

In many applications of computer graphics, art and design, it is desirable for a user to provide intuitive non-image input, such as text, sketch, stroke, graph or layout, and have a computer system automatically generate photo-realistic images that adhere to the input content.

Image Generation Image Retrieval +1

DeepAID: Interpreting and Improving Deep Learning-based Anomaly Detection in Security Applications

1 code implementation23 Sep 2021 Dongqi Han, Zhiliang Wang, Wenqi Chen, Ying Zhong, Su Wang, Han Zhang, Jiahai Yang, Xingang Shi, Xia Yin

Experimental results show that DeepAID can provide high-quality interpretations for unsupervised DL models while meeting the special requirements of security domains.

Anomaly Detection

Vector-quantized Image Modeling with Improved VQGAN

5 code implementations ICLR 2022 Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu

Motivated by this success, we explore a Vector-quantized Image Modeling (VIM) approach that involves pretraining a Transformer to predict rasterized image tokens autoregressively.

Image Generation Representation Learning +1

You Ought to Look Around: Precise, Large Span Action Detection

no code implementations 25th International Conference on Pattern Recognition (ICPR) 2021 Ge Pan, Han Zhang, Fan Yu, Yonghong Song, Yuanlin Zhang, Han Yuan

In this paper, we propose a method called YOLA (You Ought to Look Around) which includes three parts: 1) a robust backbone SPN-I3D for extracting spatio-temporal features.

Action Detection Action Localization

BLT: Bidirectional Layout Transformer for Controllable Layout Generation

1 code implementation9 Dec 2021 Xiang Kong, Lu Jiang, Huiwen Chang, Han Zhang, Yuan Hao, Haifeng Gong, Irfan Essa

During inference, BLT first generates a draft layout from the input and then iteratively refines it into a high-quality layout by masking out low-confident attributes.

Temporal Transformer Networks with Self-Supervision for Action Recognition

no code implementations14 Dec 2021 Yongkang Zhang, Jun Li, Guoming Wu, Han Zhang, Zhiping Shi, Zhaoxun Liu, Zizhang Wu, Na Jiang

The temporal sequence self-supervision module we employ unprecedentedly adopts the streamlined strategy of "random batch random channel" to reverse the sequence of video frames, allowing robust extractions of motion information representation from inversed temporal dimensions and improving the generalization capability of the model.

Action Recognition Temporal Action Localization

Learning to Prompt for Continual Learning

4 code implementations CVPR 2022 Zifeng Wang, Zizhao Zhang, Chen-Yu Lee, Han Zhang, Ruoxi Sun, Xiaoqi Ren, Guolong Su, Vincent Perot, Jennifer Dy, Tomas Pfister

The mainstream paradigm behind continual learning has been to adapt the model parameters to non-stationary data distributions, where catastrophic forgetting is the central challenge.

Class Incremental Learning Image Classification

ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation

2 code implementations31 Dec 2021 Han Zhang, Weichong Yin, Yewei Fang, Lanxin Li, Boqiang Duan, Zhihua Wu, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

To explore the landscape of large-scale pre-training for bidirectional text-image generation, we train a 10-billion parameter ERNIE-ViLG model on a large-scale dataset of 145 million (Chinese) image-text pairs which achieves state-of-the-art performance for both text-to-image and image-to-text tasks, obtaining an FID of 7. 9 on MS-COCO for text-to-image synthesis and best results on COCO-CN and AIC-ICC for image captioning.

Image Captioning Quantization +2

MAXIM: Multi-Axis MLP for Image Processing

1 code implementation CVPR 2022 Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

In this work, we present a multi-axis MLP based architecture called MAXIM, that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks.

Deblurring Image Deblurring +6

MaskGIT: Masked Generative Image Transformer

6 code implementations CVPR 2022 Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman

At inference time, the model begins with generating all tokens of an image simultaneously, and then refines the image iteratively conditioned on the previous generation.

Image Manipulation Image Outpainting +1

StyleBERT: Chinese pretraining by font style information

no code implementations21 Feb 2022 Chao Lv, Han Zhang, Xinkai Du, Yunhao Zhang, Ying Huang, Wenhao Li, Jia Han, Shanshan Gu

With the success of down streaming task using English pre-trained language model, the pre-trained Chinese language model is also necessary to get a better performance of Chinese NLP task.

Language Modelling

Topology-Preserving Segmentation Network: A Deep Learning Segmentation Framework for Connected Component

no code implementations27 Feb 2022 Han Zhang, Lok Ming Lui

TPSN is a deformation-based model that yields a deformation map through a UNet, which takes the medical image and a template mask as inputs.

Image Segmentation Medical Image Segmentation +2

Givens Coordinate Descent Methods for Rotation Matrix Learning in Trainable Embedding Indexes

no code implementations ICLR 2022 Yunjiang Jiang, Han Zhang, Yiming Qiu, Yun Xiao, Bo Long, Wen-Yun Yang

Product quantization (PQ) coupled with a space rotation, is widely used in modern approximate nearest neighbor (ANN) search systems to significantly compress the disk storage for embeddings and speed up the inner product computation.

Quantization

Unitail: Detecting, Reading, and Matching in Retail Scene

no code implementations1 Apr 2022 Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides

To make full use of computer vision technology in stores, it is required to consider the actual needs that fit the characteristics of the retail scene.

Benchmarking Dense Object Detection +2

MaxViT: Multi-Axis Vision Transformer

14 code implementations4 Apr 2022 Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

We also show that our proposed model expresses strong generative modeling capability on ImageNet, demonstrating the superior potential of MaxViT blocks as a universal vision module.

Image Classification object-detection +1

Powering Finetuning in Few-Shot Learning: Domain-Agnostic Bias Reduction with Selected Sampling

no code implementations7 Apr 2022 Ran Tao, Han Zhang, Yutong Zheng, Marios Savvides

Class-agnostic bias is defined as the distribution shifting introduced by domain difference, which we propose Distribution Calibration Module(DCM) to reduce.

Few-Shot Learning

Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder

1 code implementation6 May 2022 Zhenghao Liu, Han Zhang, Chenyan Xiong, Zhiyuan Liu, Yu Gu, Xiaohua LI

These embeddings need to be high-dimensional to fit training signals and guarantee the retrieval effectiveness of dense retrievers.

Dimensionality Reduction Information Retrieval +1

Faithful Explanations for Deep Graph Models

no code implementations24 May 2022 Zifan Wang, Yuhang Yao, Chaoran Zhang, Han Zhang, Youjie Kang, Carlee Joe-Wong, Matt Fredrikson, Anupam Datta

Second, our analytical and empirical results demonstrate that feature attribution methods cannot capture the nonlinear effect of edge features, while existing subgraph explanation methods are not faithful.

Anomaly Detection

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

2 code implementations22 Jun 2022 Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, ZiRui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu

We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge.

Machine Translation Text-to-Image Generation +1

Pre-training Tasks for User Intent Detection and Embedding Retrieval in E-commerce Search

1 code implementation12 Aug 2022 Yiming Qiu, Chenyu Zhao, Han Zhang, Jingwei Zhuo, TianHao Li, Xiaowei Zhang, Songlin Wang, Sulong Xu, Bo Long, Wen-Yun Yang

BERT-style models pre-trained on the general corpus (e. g., Wikipedia) and fine-tuned on specific task corpus, have recently emerged as breakthrough techniques in many NLP tasks: question answering, text classification, sequence labeling and so on.

Intent Detection Question Answering +3

Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets

no code implementations15 Aug 2022 Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Xiang Li, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides

Beyond classification, Conv-Adapter can generalize to detection and segmentation tasks with more than 50% reduction of parameters but comparable performance to the traditional full fine-tuning.

Transfer Learning

Visual Prompt Tuning for Generative Transfer Learning

1 code implementation CVPR 2023 Kihyuk Sohn, Yuan Hao, José Lezama, Luisa Polania, Huiwen Chang, Han Zhang, Irfan Essa, Lu Jiang

We base our framework on state-of-the-art generative vision transformers that represent an image as a sequence of visual tokens to the autoregressive or non-autoregressive transformers.

Image Generation Transfer Learning +1

Towards Semi-automatic Detection and Localization of Indoor Accessibility Issues using Mobile Depth Scanning and Computer Vision

no code implementations5 Oct 2022 Xia Su, Kaiming Cheng, Han Zhang, Jaewook Lee, Jon E. Froehlich

To help improve the safety and accessibility of indoor spaces, researchers and health professionals have created assessment instruments that enable homeowners and trained experts to audit and improve homes.

Topology-Preserving Segmentation Network

no code implementations7 Oct 2022 Han Zhang, Lok Ming Lui

Comparing to the segmentation framework based on pixel-wise classification, deformation-based segmentation models that warp a template to enclose the regions are more convenient to enforce geometric constraints.

Image Segmentation Medical Image Segmentation +2

Using Language to Extend to Unseen Domains

1 code implementation18 Oct 2022 Lisa Dunlap, Clara Mohri, Devin Guillory, Han Zhang, Trevor Darrell, Joseph E. Gonzalez, aditi raghunathan, Anja Rohrbach

It is expensive to collect training data for every possible domain that a vision model may encounter when deployed.

Domain Adaptation

Cost Splitting for Multi-Objective Conflict-Based Search

no code implementations23 Nov 2022 Cheng Ge, Han Zhang, Jiaoyang Li, Sven Koenig

Our theoretical results show that, when combined with either of these two new splitting strategies, MO-CBS maintains its completeness and optimality guarantees.

Multi-Agent Path Finding

Dimensionality-Varying Diffusion Process

no code implementations CVPR 2023 Han Zhang, Ruili Feng, Zhantao Yang, Lianghua Huang, Yu Liu, Yifei Zhang, Yujun Shen, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which learn to reverse a signal destruction process to generate new data, typically require the signal at each step to have the same dimension.

Image Generation

FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation

1 code implementation ICCV 2023 Ronghui Li, Junfan Zhao, Yachao Zhang, Mingyang Su, Zeping Ren, Han Zhang, Yansong Tang, Xiu Li

To address these problems, we propose FineDance, which contains 14. 6 hours of music-dance paired data, with fine-grained hand motions, fine-grained genres (22 dance genres), and accurate posture.

Motion Synthesis Retrieval

Enhanced Training of Query-Based Object Detection via Selective Query Recollection

2 code implementations CVPR 2023 Fangyi Chen, Han Zhang, Kai Hu, Yu-Kai Huang, Chenchen Zhu, Marios Savvides

This paper investigates a phenomenon where query-based object detectors mispredict at the last decoding stage while predicting correctly at an intermediate stage.

Attribute Object +2

Muse: Text-To-Image Generation via Masked Generative Transformers

4 code implementations2 Jan 2023 Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan

Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding.

 Ranked #1 on Text-to-Image Generation on MS-COCO (FID metric)

Language Modelling Large Language Model +1

Evolution of Filter Bubbles and Polarization in News Recommendation

no code implementations26 Jan 2023 Han Zhang, Ziwei Zhu, James Caverlee

However, most existing work focuses on a static setting or over a short-time window, leaving open questions about the long-term and dynamic impacts of news recommendations.

News Recommendation Recommendation Systems

VQ3D: Learning a 3D-Aware Generative Model on ImageNet

no code implementations ICCV 2023 Kyle Sargent, Jing Yu Koh, Han Zhang, Huiwen Chang, Charles Herrmann, Pratul Srinivasan, Jiajun Wu, Deqing Sun

Recent work has shown the possibility of training generative models of 3D content from 2D image collections on small datasets corresponding to a single object class, such as human faces, animal faces, or cars.

Position

StraIT: Non-autoregressive Generation with Stratified Image Transformer

no code implementations1 Mar 2023 Shengju Qian, Huiwen Chang, Yuanzhen Li, Zizhao Zhang, Jiaya Jia, Han Zhang

We propose Stratified Image Transformer(StraIT), a pure non-autoregressive(NAR) generative model that demonstrates superiority in high-quality image synthesis over existing autoregressive(AR) and diffusion models(DMs).

Image Generation

Streaming Kernel PCA Algorithm With Small Space

no code implementations8 Mar 2023 Yichuan Deng, Zhao Song, Zifan Wang, Han Zhang

The kernel method, which is commonly used in learning algorithms such as Support Vector Machines (SVMs), has also been applied in PCA algorithms.

Steering Prototypes with Prompt-tuning for Rehearsal-free Continual Learning

2 code implementations16 Mar 2023 Zhuowei Li, Long Zhao, Zizhao Zhang, Han Zhang, Di Liu, Ting Liu, Dimitris N. Metaxas

In the context of continual learning, prototypes-as representative class embeddings-offer advantages in memory conservation and the mitigation of catastrophic forgetting.

Class Incremental Learning Contrastive Learning +1

Delayed and Indirect Impacts of Link Recommendations

no code implementations17 Mar 2023 Han Zhang, Shangen Lu, Yixin Wang, Mihaela Curmei

Moreover, we show that the effects of recommendations can persist in networks, in part due to their indirect impacts on natural dynamics even after recommendations are turned off.

counterfactual

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning

1 code implementation ICCV 2023 Ligong Han, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, Feng Yang

Diffusion models have achieved remarkable success in text-to-image generation, enabling the creation of high-quality images from text prompts or other modalities.

Data Augmentation Efficient Diffusion Personalization +1

Identity Encoder for Personalized Diffusion

no code implementations14 Apr 2023 Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia

Our approach greatly reduces the overhead for personalized image generation and is more applicable in many potential applications.

Image Enhancement Image Generation

Spikingformer: Spike-driven Residual Learning for Transformer-based Spiking Neural Network

1 code implementation24 Apr 2023 Chenlin Zhou, Liutao Yu, Zhaokun Zhou, Zhengyu Ma, Han Zhang, Huihui Zhou, Yonghong Tian

Based on this residual design, we develop Spikingformer, a pure transformer-based spiking neural network.

Mining fMRI Dynamics with Parcellation Prior for Brain Disease Diagnosis

no code implementations4 May 2023 Xiaozhao Liu, Mianxin Liu, Lang Mei, Yuyao Zhang, Feng Shi, Han Zhang, Dinggang Shen

To characterize atypical brain dynamics under diseases, prevalent studies investigate functional magnetic resonance imaging (fMRI).

Graph Learning Multiple Instance Learning

Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation

1 code implementation NeurIPS 2023 Lisa Dunlap, Alyssa Umino, Han Zhang, Jiezhi Yang, Joseph E. Gonzalez, Trevor Darrell

As such, we explore how natural language descriptions of the domains seen in training data can be used with large vision models trained on diverse pretraining datasets to generate useful variations of the training data.

Domain Generalization Image Augmentation

Learning Disentangled Prompts for Compositional Image Synthesis

no code implementations1 Jun 2023 Kihyuk Sohn, Albert Shaw, Yuan Hao, Han Zhang, Luisa Polania, Huiwen Chang, Lu Jiang, Irfan Essa

We study domain-adaptive image synthesis, the problem of teaching pretrained image generative models a new style or concept from as few as one image to synthesize novel images, to better understand the compositional image synthesis.

Domain Adaptation Image Generation +1

Eliminating Lipschitz Singularities in Diffusion Models

no code implementations20 Jun 2023 Zhantao Yang, Ruili Feng, Han Zhang, Yujun Shen, Kai Zhu, Lianghua Huang, Yifei Zhang, Yu Liu, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which employ stochastic differential equations to sample images through integrals, have emerged as a dominant class of generative models.

ArrayBot: Reinforcement Learning for Generalizable Distributed Manipulation through Touch

no code implementations29 Jun 2023 Zhengrong Xue, Han Zhang, Jingwen Cheng, Zhengmao He, Yuanchen Ju, Changyi Lin, Gu Zhang, Huazhe Xu

We present ArrayBot, a distributed manipulation system consisting of a $16 \times 16$ array of vertically sliding pillars integrated with tactile sensors, which can simultaneously support, perceive, and manipulate the tabletop objects.

reinforcement-learning Reinforcement Learning (RL)

Errors Dynamics in Affine Group Systems

no code implementations31 Jul 2023 Xinghan Li, Jianqi Chen, Han Zhang, Jieqiang Wei, Junfeng Wu

In this paper, we focus on the error dynamics analysis for an affine group system under external disturbances or random noises.

NEON: Living Needs Prediction System in Meituan

no code implementations31 Jul 2023 Xiaochong Lan, Chen Gao, Shiqi Wen, Xiuqi Chen, Yingge Che, Han Zhang, Huazhou Wei, Hengliang Luo, Yong Li

To address these two challenges, we design a system of living NEeds predictiON named NEON, consisting of three phases: feature mining, feature fusion, and multi-task prediction.

Differentiable Retrieval Augmentation via Generative Language Modeling for E-commerce Query Intent Classification

no code implementations18 Aug 2023 Chenyu Zhao, Yunjiang Jiang, Yiming Qiu, Han Zhang, Wen-Yun Yang

Retrieval augmentation, which enhances downstream models by a knowledge retriever and an external corpus instead of by merely increasing the number of model parameters, has been successfully applied to many natural language processing (NLP) tasks such as text classification, question answering and so on.

intent-classification Intent Classification +5

Price-Discrimination Game for Distributed Resource Management in Federated Learning

no code implementations26 Aug 2023 Han Zhang, Halvin Yang, Guopeng Zhang

In vanilla federated learning (FL) such as FedAvg, the parameter server (PS) and multiple distributed clients can form a typical buyer's market, where the number of PS/buyers of FL services is far less than the number of clients/sellers.

Federated Learning Management

Chat2Brain: A Method for Mapping Open-Ended Semantic Queries to Brain Activation Maps

no code implementations10 Sep 2023 Yaonai Wei, Tuo Zhang, Han Zhang, Tianyang Zhong, Lin Zhao, Zhengliang Liu, Chong Ma, Songyao Zhang, Muheng Shang, Lei Du, Xiao Li, Tianming Liu, Junwei Han

In this study, we propose a method called Chat2Brain that combines LLMs to basic text-2-image model, known as Text2Brain, to map open-ended semantic queries to brain activation maps in data-scarce and complex query environments.

Deformation-Invariant Neural Network and Its Applications in Distorted Image Restoration and Analysis

no code implementations4 Oct 2023 Han Zhang, Qiguang Chen, Lok Ming Lui

The QCTN is a deep neural network that outputs a quasiconformal map, which can be used to transform a geometrically distorted image into an improved version that is closer to the distribution of natural or good images.

Image Classification Image Restoration +1

Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency

no code implementations5 Oct 2023 Tianhong Li, Sangnie Bhardwaj, Yonglong Tian, Han Zhang, Jarred Barber, Dina Katabi, Guillaume Lajoie, Huiwen Chang, Dilip Krishnan

We demonstrate image generation and captioning performance on par with state-of-the-art text-to-image and image-to-text models with orders of magnitude fewer (only 3M) paired image-text data.

Text-to-Image Generation

Accurate battery lifetime prediction across diverse aging conditions with deep learning

no code implementations8 Oct 2023 Han Zhang, Yuqi Li, Shun Zheng, Ziheng Lu, Xiaofan Gui, Wei Xu, Jiang Bian

Here we introduce a universal deep learning approach that is capable of accommodating various aging conditions and facilitating effective learning under low-resource conditions by leveraging data from rich conditions.

Single-cell modeling

Towards Foundation Models for Learning on Tabular Data

no code implementations11 Oct 2023 Han Zhang, Xumeng Wen, Shun Zheng, Wei Xu, Jiang Bian

Despite considerable efforts in developing effective learning models for tabular data, current transferable tabular models remain in their infancy, limited by either the lack of support for direct instruction following in new tasks or the neglect of acquiring foundational knowledge and capabilities from diverse tabular datasets.

Instruction Following Language Modelling +1

BatteryML:An Open-source platform for Machine Learning on Battery Degradation

1 code implementation23 Oct 2023 Han Zhang, Xiaofan Gui, Shun Zheng, Ziheng Lu, Yuqi Li, Jiang Bian

Battery degradation remains a pivotal concern in the energy storage domain, with machine learning emerging as a potent tool to drive forward insights and solutions.

COPR: Continual Learning Human Preference through Optimal Policy Regularization

no code implementations24 Oct 2023 Han Zhang, Lin Gui, Yuanzhao Zhai, Hui Wang, Yu Lei, Ruifeng Xu

The technique of Reinforcement Learning from Human Feedback (RLHF) is a commonly employed method to improve pre-trained Language Models (LM), enhancing their ability to conform to human preferences.

Continual Learning reinforcement-learning

GFS: Graph-based Feature Synthesis for Prediction over Relational Databases

no code implementations4 Dec 2023 Han Zhang, Quan Gan, David Wipf, Weinan Zhang

Consequently, the prevalent approach for training machine learning models on data stored in relational databases involves performing feature engineering to merge the data from multiple tables into a single table and subsequently applying single table models.

Feature Engineering Inductive Bias

CCM: Adding Conditional Controls to Text-to-Image Consistency Models

no code implementations12 Dec 2023 Jie Xiao, Kai Zhu, Han Zhang, Zhiheng Liu, Yujun Shen, Yu Liu, Xueyang Fu, Zheng-Jun Zha

Consistency Models (CMs) have showed a promise in creating visual content efficiently and with high quality.

An Incentive Mechanism for Federated Learning Based on Multiple Resource Exchange

no code implementations13 Dec 2023 Ruonan Dong, Hui Xu, Han Zhang, Guopeng Zhang

We formulate the interaction between MO and DOs as an optimization problem, and the objective is to effectively utilize the communication and computing resource of the MO and DOs to minimize the time to complete an FL task.

Federated Learning

VideoLCM: Video Latent Consistency Model

2 code implementations14 Dec 2023 Xiang Wang, Shiwei Zhang, Han Zhang, Yu Liu, Yingya Zhang, Changxin Gao, Nong Sang

Consistency models have demonstrated powerful capability in efficient image generation and allowed synthesis within a few sampling steps, alleviating the high computational cost in diffusion models.

Computational Efficiency Image Generation +1

Gemini: A Family of Highly Capable Multimodal Models

no code implementations The Keyword 2023 Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, Ryan Doherty, Eli Collins, Clemens Meyer, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Jack Krawczyk, Ed Chi, Heng-Tze Cheng, Eric Ni, Purvi Shah, Patrick Kane, Betty Chan, Manaal Faruqui, Aliaksei Severyn, Hanzhao Lin, Yaguang Li, Yong Cheng, Mahdis Mahdieh, Mia Chen, Pei Sun, Dustin Tran, Sumit Bagri, Balaji Lakshminarayanan, Jeremiah Liu, Andras Orban, Fabian Güra, Hao Zhou, Xinying Song, Aurelien Boffy, Harish Ganapathy, Steven Zheng, HyunJeong Choe, Ágoston Weisz, Tao Zhu, Yifeng Lu, Siddharth Gopal, Jarrod Kahn, Maciej Kula, Jeff Pitman, Rushin Shah, Emanuel Taropa, Majd Al Merey, Martin Baeuml, Zhifeng Chen, Laurent El Shafey, Yujing Zhang, Olcan Sercinoglu, George Tucker, Enrique Piqueras, Maxim Krikun, Iain Barr, Nikolay Savinov, Ivo Danihelka, Becca Roelofs, Anaïs White, Anders Andreassen, Tamara von Glehn, Lakshman Yagati, Mehran Kazemi, Lucas Gonzalez, Misha Khalman, Jakub Sygnowski, Alexandre Frechette, Charlotte Smith, Laura Culp, Lev Proleev, Yi Luan, Xi Chen, James Lottes, Nathan Schucher, Federico Lebron, Alban Rrustemi, Natalie Clay, Phil Crone, Tomas Kocisky, Jeffrey Zhao, Bartek Perz, Dian Yu, Heidi Howard, Adam Bloniarz, Jack W. Rae, Han Lu, Laurent SIfre, Marcello Maggioni, Fred Alcober, Dan Garrette, Megan Barnes, Shantanu Thakoor, Jacob Austin, Gabriel Barth-Maron, William Wong, Rishabh Joshi, Rahma Chaabouni, Deeni Fatiha, Arun Ahuja, Gaurav Singh Tomar, Evan Senter, Martin Chadwick, Ilya Kornakov, Nithya Attaluri, Iñaki Iturrate, Ruibo Liu, Yunxuan Li, Sarah Cogan, Jeremy Chen, Chao Jia, Chenjie Gu, Qiao Zhang, Jordan Grimstad, Ale Jakse Hartman, Xavier Garcia, Thanumalayan Sankaranarayana Pillai, Jacob Devlin, Michael Laskin, Diego de Las Casas, Dasha Valter, Connie Tao, Lorenzo Blanco, Adrià Puigdomènech Badia, David Reitter, Mianna Chen, Jenny Brennan, Clara Rivera, Sergey Brin, Shariq Iqbal, Gabriela Surita, Jane Labanowski, Abhi Rao, Stephanie Winkler, Emilio Parisotto, Yiming Gu, Kate Olszewska, Ravi Addanki, Antoine Miech, Annie Louis, Denis Teplyashin, Geoff Brown, Elliot Catt, Jan Balaguer, Jackie Xiang, Pidong Wang, Zoe Ashwood, Anton Briukhov, Albert Webson, Sanjay Ganapathy, Smit Sanghavi, Ajay Kannan, Ming-Wei Chang, Axel Stjerngren, Josip Djolonga, Yuting Sun, Ankur Bapna, Matthew Aitchison, Pedram Pejman, Henryk Michalewski, Tianhe Yu, Cindy Wang, Juliette Love, Junwhan Ahn, Dawn Bloxwich, Kehang Han, Peter Humphreys, Thibault Sellam, James Bradbury, Varun Godbole, Sina Samangooei, Bogdan Damoc, Alex Kaskasoli, Sébastien M. R. Arnold, Vijay Vasudevan, Shubham Agrawal, Jason Riesa, Dmitry Lepikhin, Richard Tanburn, Srivatsan Srinivasan, Hyeontaek Lim, Sarah Hodkinson, Pranav Shyam, Johan Ferret, Steven Hand, Ankush Garg, Tom Le Paine, Jian Li, Yujia Li, Minh Giang, Alexander Neitz, Zaheer Abbas, Sarah York, Machel Reid, Elizabeth Cole, Aakanksha Chowdhery, Dipanjan Das, Dominika Rogozińska, Vitaliy Nikolaev, Pablo Sprechmann, Zachary Nado, Lukas Zilka, Flavien Prost, Luheng He, Marianne Monteiro, Gaurav Mishra, Chris Welty, Josh Newlan, Dawei Jia, Miltiadis Allamanis, Clara Huiyi Hu, Raoul de Liedekerke, Justin Gilmer, Carl Saroufim, Shruti Rijhwani, Shaobo Hou, Disha Shrivastava, Anirudh Baddepudi, Alex Goldin, Adnan Ozturel, Albin Cassirer, Yunhan Xu, Daniel Sohn, Devendra Sachan, Reinald Kim Amplayo, Craig Swanson, Dessie Petrova, Shashi Narayan, Arthur Guez, Siddhartha Brahma, Jessica Landon, Miteyan Patel, Ruizhe Zhao, Kevin Villela, Luyu Wang, Wenhao Jia, Matthew Rahtz, Mai Giménez, Legg Yeung, James Keeling, Petko Georgiev, Diana Mincu, Boxi Wu, Salem Haykal, Rachel Saputro, Kiran Vodrahalli, James Qin, Zeynep Cankara, Abhanshu Sharma, Nick Fernando, Will Hawkins, Behnam Neyshabur, Solomon Kim, Adrian Hutter, Priyanka Agrawal, Alex Castro-Ros, George van den Driessche, Tao Wang, Shuo-Yiin Chang, Paul Komarek, Ross Mcilroy, Mario Lučić, Guodong Zhang, Wael Farhan, Michael Sharman, Paul Natsev, Paul Michel, Yamini Bansal, Siyuan Qiao, Kris Cao, Siamak Shakeri, Christina Butterfield, Justin Chung, Paul Kishan Rubenstein, Shivani Agrawal, Arthur Mensch, Kedar Soparkar, Karel Lenc, Timothy Chung, Aedan Pope, Loren Maggiore, Jackie Kay, Priya Jhakra, Shibo Wang, Joshua Maynez, Mary Phuong, Taylor Tobin, Andrea Tacchetti, Maja Trebacz, Kevin Robinson, Yash Katariya, Sebastian Riedel, Paige Bailey, Kefan Xiao, Nimesh Ghelani, Lora Aroyo, Ambrose Slone, Neil Houlsby, Xuehan Xiong, Zhen Yang, Elena Gribovskaya, Jonas Adler, Mateo Wirth, Lisa Lee, Music Li, Thais Kagohara, Jay Pavagadhi, Sophie Bridgers, Anna Bortsova, Sanjay Ghemawat, Zafarali Ahmed, Tianqi Liu, Richard Powell, Vijay Bolina, Mariko Iinuma, Polina Zablotskaia, James Besley, Da-Woon Chung, Timothy Dozat, Ramona Comanescu, Xiance Si, Jeremy Greer, Guolong Su, Martin Polacek, Raphaël Lopez Kaufman, Simon Tokumine, Hexiang Hu, Elena Buchatskaya, Yingjie Miao, Mohamed Elhawaty, Aditya Siddhant, Nenad Tomasev, Jinwei Xing, Christina Greer, Helen Miller, Shereen Ashraf, Aurko Roy, Zizhao Zhang, Ada Ma, Angelos Filos, Milos Besta, Rory Blevins, Ted Klimenko, Chih-Kuan Yeh, Soravit Changpinyo, Jiaqi Mu, Oscar Chang, Mantas Pajarskas, Carrie Muir, Vered Cohen, Charline Le Lan, Krishna Haridasan, Amit Marathe, Steven Hansen, Sholto Douglas, Rajkumar Samuel, Mingqiu Wang, Sophia Austin, Chang Lan, Jiepu Jiang, Justin Chiu, Jaime Alonso Lorenzo, Lars Lowe Sjösund, Sébastien Cevey, Zach Gleicher, Thi Avrahami, Anudhyan Boral, Hansa Srinivasan, Vittorio Selo, Rhys May, Konstantinos Aisopos, Léonard Hussenot, Livio Baldini Soares, Kate Baumli, Michael B. Chang, Adrià Recasens, Ben Caine, Alexander Pritzel, Filip Pavetic, Fabio Pardo, Anita Gergely, Justin Frye, Vinay Ramasesh, Dan Horgan, Kartikeya Badola, Nora Kassner, Subhrajit Roy, Ethan Dyer, Víctor Campos Campos, Alex Tomala, Yunhao Tang, Dalia El Badawy, Elspeth White, Basil Mustafa, Oran Lang, Abhishek Jindal, Sharad Vikram, Zhitao Gong, Sergi Caelles, Ross Hemsley, Gregory Thornton, Fangxiaoyu Feng, Wojciech Stokowiec, Ce Zheng, Phoebe Thacker, Çağlar Ünlü, Zhishuai Zhang, Mohammad Saleh, James Svensson, Max Bileschi, Piyush Patil, Ankesh Anand, Roman Ring, Katerina Tsihlas, Arpi Vezer, Marco Selvi, Toby Shevlane, Mikel Rodriguez, Tom Kwiatkowski, Samira Daruki, Keran Rong, Allan Dafoe, Nicholas FitzGerald, Keren Gu-Lemberg, Mina Khan, Lisa Anne Hendricks, Marie Pellat, Vladimir Feinberg, James Cobon-Kerr, Tara Sainath, Maribeth Rauh, Sayed Hadi Hashemi, Richard Ives, Yana Hasson, Eric Noland, Yuan Cao, Nathan Byrd, Le Hou, Qingze Wang, Thibault Sottiaux, Michela Paganini, Jean-Baptiste Lespiau, Alexandre Moufarek, Samer Hassan, Kaushik Shivakumar, Joost van Amersfoort, Amol Mandhane, Pratik Joshi, Anirudh Goyal, Matthew Tung, Andrew Brock, Hannah Sheahan, Vedant Misra, Cheng Li, Nemanja Rakićević, Mostafa Dehghani, Fangyu Liu, Sid Mittal, Junhyuk Oh, Seb Noury, Eren Sezener, Fantine Huot, Matthew Lamm, Nicola De Cao, Charlie Chen, Sidharth Mudgal, Romina Stella, Kevin Brooks, Gautam Vasudevan, Chenxi Liu, Mainak Chain, Nivedita Melinkeri, Aaron Cohen, Venus Wang, Kristie Seymore, Sergey Zubkov, Rahul Goel, Summer Yue, Sai Krishnakumaran, Brian Albert, Nate Hurley, Motoki Sano, Anhad Mohananey, Jonah Joughin, Egor Filonov, Tomasz Kępa, Yomna Eldawy, Jiawern Lim, Rahul Rishi, Shirin Badiezadegan, Taylor Bos, Jerry Chang, Sanil Jain, Sri Gayatri Sundara Padmanabhan, Subha Puttagunta, Kalpesh Krishna, Leslie Baker, Norbert Kalb, Vamsi Bedapudi, Shuntong Lei, Anthony Yu, Oren Litvin, Xiang Zhou, Zhichun Wu, Sam Sobell, Andrea Siciliano, Alan Papir, Robby Neale, Jonas Bragagnolo, Tej Toor, Tina Chen, Valentin Anklin, Feiran Wang, Richie Feng, Milad Gholami, Kevin Ling, Lijuan Liu, Jules Walter, Hamid Moghaddam, Arun Kishore, Jakub Adamek, Tyler Mercado, Jonathan Mallinson, Siddhinita Wandekar, Stephen Cagle, Eran Ofek, Guillermo Garrido, Clemens Lombriser, Maksim Mukha, Botu Sun, Hafeezul Rahman Mohammad, Josip Matak, Yadi Qian, Vikas Peswani, Pawel Janus, Quan Yuan, Leif Schelin, Oana David, Ankur Garg, Yifan He, Oleksii Duzhyi, Anton Älgmyr, Timothée Lottaz, Qi Li, Vikas Yadav, Luyao Xu, Alex Chinien, Rakesh Shivanna, Aleksandr Chuklin, Josie Li, Carrie Spadine, Travis Wolfe, Kareem Mohamed, Subhabrata Das, Zihang Dai, Kyle He, Daniel von Dincklage, Shyam Upadhyay, Akanksha Maurya, Luyan Chi, Sebastian Krause, Khalid Salama, Pam G Rabinovitch, Pavan Kumar Reddy M, Aarush Selvan, Mikhail Dektiarev, Golnaz Ghiasi, Erdem Guven, Himanshu Gupta, Boyi Liu, Deepak Sharma, Idan Heimlich Shtacher, Shachi Paul, Oscar Akerlund, François-Xavier Aubet, Terry Huang, Chen Zhu, Eric Zhu, Elico Teixeira, Matthew Fritze, Francesco Bertolini, Liana-Eleonora Marinescu, Martin Bölle, Dominik Paulus, Khyatti Gupta, Tejasi Latkar, Max Chang, Jason Sanders, Roopa Wilson, Xuewei Wu, Yi-Xuan Tan, Lam Nguyen Thiet, Tulsee Doshi, Sid Lall, Swaroop Mishra, Wanming Chen, Thang Luong, Seth Benjamin, Jasmine Lee, Ewa Andrejczuk, Dominik Rabiej, Vipul Ranjan, Krzysztof Styrc, Pengcheng Yin, Jon Simon, Malcolm Rose Harriott, Mudit Bansal, Alexei Robsky, Geoff Bacon, David Greene, Daniil Mirylenka, Chen Zhou, Obaid Sarvana, Abhimanyu Goyal, Samuel Andermatt, Patrick Siegler, Ben Horn, Assaf Israel, Francesco Pongetti, Chih-Wei "Louis" Chen, Marco Selvatici, Pedro Silva, Kathie Wang, Jackson Tolins, Kelvin Guu, Roey Yogev, Xiaochen Cai, Alessandro Agostini, Maulik Shah, Hung Nguyen, Noah Ó Donnaile, Sébastien Pereira, Linda Friso, Adam Stambler, Adam Kurzrok, Chenkai Kuang, Yan Romanikhin, Mark Geller, ZJ Yan, Kane Jang, Cheng-Chun Lee, Wojciech Fica, Eric Malmi, Qijun Tan, Dan Banica, Daniel Balle, Ryan Pham, Yanping Huang, Diana Avram, Hongzhi Shi, Jasjot Singh, Chris Hidey, Niharika Ahuja, Pranab Saxena, Dan Dooley, Srividya Pranavi Potharaju, Eileen O'Neill, Anand Gokulchandran, Ryan Foley, Kai Zhao, Mike Dusenberry, YuAn Liu, Pulkit Mehta, Ragha Kotikalapudi, Chalence Safranek-Shrader, Andrew Goodman, Joshua Kessinger, Eran Globen, Prateek Kolhar, Chris Gorgolewski, Ali Ibrahim, Yang song, Ali Eichenbaum, Thomas Brovelli, Sahitya Potluri, Preethi Lahoti, Cip Baetu, Ali Ghorbani, Charles Chen, Andy Crawford, Shalini Pal, Mukund Sridhar, Petru Gurita, Asier Mujika, Igor Petrovski, Pierre-Louis Cedoz, Chenmei Li, Shiyuan Chen, Niccolò Dal Santo, Siddharth Goyal, Jitesh Punjabi, Karthik Kappaganthu, Chester Kwak, Pallavi LV, Sarmishta Velury, Himadri Choudhury, Jamie Hall, Premal Shah, Ricardo Figueira, Matt Thomas, Minjie Lu, Ting Zhou, Chintu Kumar, Thomas Jurdi, Sharat Chikkerur, Yenai Ma, Adams Yu, Soo Kwak, Victor Ähdel, Sujeevan Rajayogam, Travis Choma, Fei Liu, Aditya Barua, Colin Ji, Ji Ho Park, Vincent Hellendoorn, Alex Bailey, Taylan Bilal, Huanjie Zhou, Mehrdad Khatir, Charles Sutton, Wojciech Rzadkowski, Fiona Macintosh, Konstantin Shagin, Paul Medina, Jinjing Zhou, Pararth Shah, Yingying Bi, Attila Dankovics, Shipra Banga, Sabine Lehmann, Marissa Bredesen, Zifan Lin, John Eric Hoffmann, Jonathan Lai, Raynald Chung, Kai Yang, Nihal Balani, Arthur Bražinskas, Andrei Sozanschi, Matthew Hayes, Héctor Fernández Alcalde, Peter Makarov, Will Chen, Antonio Stella, Liselotte Snijders, Michael Mandl, Ante Kärrman, Paweł Nowak, Xinyi Wu, Alex Dyck, Krishnan Vaidyanathan, Raghavender R, Jessica Mallet, Mitch Rudominer, Eric Johnston, Sushil Mittal, Akhil Udathu, Janara Christensen, Vishal Verma, Zach Irving, Andreas Santucci, Gamaleldin Elsayed, Elnaz Davoodi, Marin Georgiev, Ian Tenney, Geoffrey Cideron, Edouard Leurent, Mahmoud Alnahlawi, Ionut Georgescu, Nan Wei, Ivy Zheng, Dylan Scandinaro, Heinrich Jiang, Jasper Snoek, Mukund Sundararajan, Xuezhi Wang, Zack Ontiveros, Itay Karo, Jeremy Cole, Vinu Rajashekhar, Lara Tumeh, Eyal Ben-David, Rishub Jain, Jonathan Uesato, Romina Datta, Oskar Bunyan, Shimu Wu, John Zhang, Piotr Stanczyk, Ye Zhang, David Steiner, Subhajit Naskar, Michael Azzam, Matthew Johnson, Adam Paszke, Chung-Cheng Chiu, Jaume Sanchez Elias, Afroz Mohiuddin, Faizan Muhammad, Jin Miao, Andrew Lee, Nino Vieillard, Jane Park, Jiageng Zhang, Jeff Stanway, Drew Garmon, Abhijit Karmarkar, Zhe Dong, Jong Lee, Aviral Kumar, Luowei Zhou, Jonathan Evens, William Isaac, Geoffrey Irving, Edward Loper, Michael Fink, Isha Arkatkar, Nanxin Chen, Izhak Shafran, Ivan Petrychenko, Zhe Chen, Johnson Jia, Anselm Levskaya, Zhenkai Zhu, Peter Grabowski, Yu Mao, Alberto Magni, Kaisheng Yao, Javier Snaider, Norman Casagrande, Evan Palmer, Paul Suganthan, Alfonso Castaño, Irene Giannoumis, Wooyeol Kim, Mikołaj Rybiński, Ashwin Sreevatsa, Jennifer Prendki, David Soergel, Adrian Goedeckemeyer, Willi Gierke, Mohsen Jafari, Meenu Gaba, Jeremy Wiesner, Diana Gage Wright, Yawen Wei, Harsha Vashisht, Yana Kulizhskaya, Jay Hoover, Maigo Le, Lu Li, Chimezie Iwuanyanwu, Lu Liu, Kevin Ramirez, Andrey Khorlin, Albert Cui, Tian Lin, Marcus Wu, Ricardo Aguilar, Keith Pallo, Abhishek Chakladar, Ginger Perng, Elena Allica Abellan, Mingyang Zhang, Ishita Dasgupta, Nate Kushman, Ivo Penchev, Alena Repina, Xihui Wu, Tom van der Weide, Priya Ponnapalli, Caroline Kaplan, Jiri Simsa, Shuangfeng Li, Olivier Dousse, Jeff Piper, Nathan Ie, Rama Pasumarthi, Nathan Lintz, Anitha Vijayakumar, Daniel Andor, Pedro Valenzuela, Minnie Lui, Cosmin Paduraru, Daiyi Peng, Katherine Lee, Shuyuan Zhang, Somer Greene, Duc Dung Nguyen, Paula Kurylowicz, Cassidy Hardin, Lucas Dixon, Lili Janzer, Kiam Choo, Ziqiang Feng, Biao Zhang, Achintya Singhal, Dayou Du, Dan McKinnon, Natasha Antropova, Tolga Bolukbasi, Orgad Keller, David Reid, Daniel Finchelstein, Maria Abi Raad, Remi Crocker, Peter Hawkins, Robert Dadashi, Colin Gaffney, Ken Franko, Anna Bulanova, Rémi Leblond, Shirley Chung, Harry Askham, Luis C. Cobo, Kelvin Xu, Felix Fischer, Jun Xu, Christina Sorokin, Chris Alberti, Chu-Cheng Lin, Colin Evans, Alek Dimitriev, Hannah Forbes, Dylan Banarse, Zora Tung, Mark Omernick, Colton Bishop, Rachel Sterneck, Rohan Jain, Jiawei Xia, Ehsan Amid, Francesco Piccinno, Xingyu Wang, Praseem Banzal, Daniel J. Mankowitz, Alex Polozov, Victoria Krakovna, Sasha Brown, Mohammadhossein Bateni, Dennis Duan, Vlad Firoiu, Meghana Thotakuri, Tom Natan, Matthieu Geist, Ser tan Girgin, Hui Li, Jiayu Ye, Ofir Roval, Reiko Tojo, Michael Kwong, James Lee-Thorp, Christopher Yew, Danila Sinopalnikov, Sabela Ramos, John Mellor, Abhishek Sharma, Kathy Wu, David Miller, Nicolas Sonnerat, Denis Vnukov, Rory Greig, Jennifer Beattie, Emily Caveness, Libin Bai, Julian Eisenschlos, Alex Korchemniy, Tomy Tsai, Mimi Jasarevic, Weize Kong, Phuong Dao, Zeyu Zheng, Frederick Liu, Fan Yang, Rui Zhu, Tian Huey Teh, Jason Sanmiya, Evgeny Gladchenko, Nejc Trdin, Daniel Toyama, Evan Rosen, Sasan Tavakkol, Linting Xue, Chen Elkind, Oliver Woodman, John Carpenter, George Papamakarios, Rupert Kemp, Sushant Kafle, Tanya Grunina, Rishika Sinha, Alice Talbert, Diane Wu, Denese Owusu-Afriyie, Cosmo Du, Chloe Thornton, Jordi Pont-Tuset, Pradyumna Narayana, Jing Li, Saaber Fatehi, John Wieting, Omar Ajmeri, Benigno Uria, Yeongil Ko, Laura Knight, Amélie Héliou, Ning Niu, Shane Gu, Chenxi Pang, Yeqing Li, Nir Levine, Ariel Stolovich, Rebeca Santamaria-Fernandez, Sonam Goenka, Wenny Yustalim, Robin Strudel, Ali Elqursh, Charlie Deck, Hyo Lee, Zonglin Li, Kyle Levin, Raphael Hoffmann, Dan Holtmann-Rice, Olivier Bachem, Sho Arora, Christy Koh, Soheil Hassas Yeganeh, Siim Põder, Mukarram Tariq, Yanhua Sun, Lucian Ionita, Mojtaba Seyedhosseini, Pouya Tafti, Zhiyu Liu, Anmol Gulati, Jasmine Liu, Xinyu Ye, Bart Chrzaszcz, Lily Wang, Nikhil Sethi, Tianrun Li, Ben Brown, Shreya Singh, Wei Fan, Aaron Parisi, Joe Stanton, Vinod Koverkathu, Christopher A. Choquette-Choo, Yunjie Li, TJ Lu, Abe Ittycheriah, Prakash Shroff, Mani Varadarajan, Sanaz Bahargam, Rob Willoughby, David Gaddy, Guillaume Desjardins, Marco Cornero, Brona Robenek, Bhavishya Mittal, Ben Albrecht, Ashish Shenoy, Fedor Moiseev, Henrik Jacobsson, Alireza Ghaffarkhah, Morgane Rivière, Alanna Walton, Clément Crepy, Alicia Parrish, Zongwei Zhou, Clement Farabet, Carey Radebaugh, Praveen Srinivasan, Claudia van der Salm, Andreas Fidjeland, Salvatore Scellato, Eri Latorre-Chimoto, Hanna Klimczak-Plucińska, David Bridson, Dario de Cesare, Tom Hudson, Piermaria Mendolicchio, Lexi Walker, Alex Morris, Matthew Mauger, Alexey Guseynov, Alison Reid, Seth Odoom, Lucia Loher, Victor Cotruta, Madhavi Yenugula, Dominik Grewe, Anastasia Petrushkina, Tom Duerig, Antonio Sanchez, Steve Yadlowsky, Amy Shen, Amir Globerson, Lynette Webb, Sahil Dua, Dong Li, Surya Bhupatiraju, Dan Hurt, Haroon Qureshi, Ananth Agarwal, Tomer Shani, Matan Eyal, Anuj Khare, Shreyas Rammohan Belle, Lei Wang, Chetan Tekur, Mihir Sanjay Kale, Jinliang Wei, Ruoxin Sang, Brennan Saeta, Tyler Liechty, Yao Zhao, Stephan Lee, Pandu Nayak, Doug Fritz, Manish Reddy Vuyyuru, John Aslanides, Nidhi Vyas, Martin Wicke, Xiao Ma, Evgenii Eltyshev, Nina Martin, Hardie Cate, James Manyika, Keyvan Amiri, Yelin Kim, Xi Xiong, Kai Kang, Florian Luisier, Nilesh Tripuraneni, David Madras, Mandy Guo, Austin Waters, Oliver Wang, Joshua Ainslie, Jason Baldridge, Han Zhang, Garima Pruthi, Jakob Bauer, Feng Yang, Riham Mansour, Jason Gelman, Yang Xu, George Polovets, Ji Liu, Honglong Cai, Warren Chen, XiangHai Sheng, Emily Xue, Sherjil Ozair, Christof Angermueller, Xiaowei Li, Anoop Sinha, Weiren Wang, Julia Wiesinger, Emmanouil Koukoumidis, Yuan Tian, Anand Iyer, Madhu Gurumurthy, Mark Goldenson, Parashar Shah, MK Blake, Hongkun Yu, Anthony Urbanowicz, Jennimaria Palomaki, Chrisantha Fernando, Ken Durden, Harsh Mehta, Nikola Momchev, Elahe Rahimtoroghi, Maria Georgaki, Amit Raul, Sebastian Ruder, Morgan Redshaw, Jinhyuk Lee, Denny Zhou, Komal Jalan, Dinghua Li, Blake Hechtman, Parker Schuh, Milad Nasr, Kieran Milan, Vladimir Mikulik, Juliana Franco, Tim Green, Nam Nguyen, Joe Kelley, Aroma Mahendru, Andrea Hu, Joshua Howland, Ben Vargas, Jeffrey Hui, Kshitij Bansal, Vikram Rao, Rakesh Ghiya, Emma Wang, Ke Ye, Jean Michel Sarr, Melanie Moranski Preston, Madeleine Elish, Steve Li, Aakash Kaku, Jigar Gupta, Ice Pasupat, Da-Cheng Juan, Milan Someswar, Tejvi M., Xinyun Chen, Aida Amini, Alex Fabrikant, Eric Chu, Xuanyi Dong, Amruta Muthal, Senaka Buthpitiya, Sarthak Jauhari, Nan Hua, Urvashi Khandelwal, Ayal Hitron, Jie Ren, Larissa Rinaldi, Shahar Drath, Avigail Dabush, Nan-Jiang Jiang, Harshal Godhia, Uli Sachs, Anthony Chen, Yicheng Fan, Hagai Taitelbaum, Hila Noga, Zhuyun Dai, James Wang, Chen Liang, Jenny Hamer, Chun-Sung Ferng, Chenel Elkind, Aviel Atias, Paulina Lee, Vít Listík, Mathias Carlen, Jan van de Kerkhof, Marcin Pikus, Krunoslav Zaher, Paul Müller, Sasha Zykova, Richard Stefanec, Vitaly Gatsko, Christoph Hirnschall, Ashwin Sethi, Xingyu Federico Xu, Chetan Ahuja, Beth Tsai, Anca Stefanoiu, Bo Feng, Keshav Dhandhania, Manish Katyal, Akshay Gupta, Atharva Parulekar, Divya Pitta, Jing Zhao, Vivaan Bhatia, Yashodha Bhavnani, Omar Alhadlaq, Xiaolin Li, Peter Danenberg, Dennis Tu, Alex Pine, Vera Filippova, Abhipso Ghosh, Ben Limonchik, Bhargava Urala, Chaitanya Krishna Lanka, Derik Clive, Yi Sun, Edward Li, Hao Wu, Kevin Hongtongsak, Ianna Li, Kalind Thakkar, Kuanysh Omarov, Kushal Majmundar, Michael Alverson, Michael Kucharski, Mohak Patel, Mudit Jain, Maksim Zabelin, Paolo Pelagatti, Rohan Kohli, Saurabh Kumar, Joseph Kim, Swetha Sankar, Vineet Shah, Lakshmi Ramachandruni, Xiangkai Zeng, Ben Bariach, Laura Weidinger, Amar Subramanya, Sissie Hsiao, Demis Hassabis, Koray Kavukcuoglu, Adam Sadovsky, Quoc Le, Trevor Strohman, Yonghui Wu, Slav Petrov, Jeffrey Dean, Oriol Vinyals

This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding.

 Ranked #1 on Multi-task Language Understanding on MMLU (using extra training data)

Arithmetic Reasoning Code Generation +3

Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles

no code implementations30 Dec 2023 Yuanzhao Zhai, Han Zhang, Yu Lei, Yue Yu, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang

Reinforcement learning from human feedback (RLHF) emerges as a promising paradigm for aligning large language models (LLMs).

Uncertainty Quantification

Fundamental Limitation of Semantic Communications: Neural Estimation for Rate-Distortion

no code implementations2 Jan 2024 Dongxu Li, Jianhao Huang, Chuan Huang, Xiaoqi Qin, Han Zhang, Ping Zhang

For the case with unknown semantic source distribution, while only a set of the source samples is available, we propose a neural-network-based method by leveraging the generative networks to learn the semantic source distribution.

Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation

no code implementations11 Jan 2024 Seung Hyun Lee, Yinxiao Li, Junjie Ke, Innfarn Yoo, Han Zhang, Jiahui Yu, Qifei Wang, Fei Deng, Glenn Entis, Junfeng He, Gang Li, Sangpil Kim, Irfan Essa, Feng Yang

Additionally, Parrot employs a joint optimization approach for the T2I model and the prompt expansion network, facilitating the generation of quality-aware text prompts, thus further enhancing the final image quality.

Reinforcement Learning (RL) Text-to-Image Generation

CPPO: Continual Learning for Reinforcement Learning with Human Feedback

no code implementations Conference 2024 Han Zhang, Yu Lei, Lin Gui, Min Yang, Yulan He, Hui Wang, Ruifeng Xu

The approach of Reinforcement Learning from Human Feedback (RLHF) is widely used for enhancing pre-trained Language Models (LM), enabling them to better align with human preferences.

Continual Learning reinforcement-learning

Federated Learning with Dual Attention for Robust Modulation Classification under Attacks

no code implementations19 Jan 2024 Han Zhang, Medhat Elsayed, Majid Bavand, Raimundas Gaigalas, Yigit Ozcan, Melike Erol-Kantarci

To this end, we leverage attention mechanisms as a defense against attacks in FL and propose a robust FL algorithm by integrating the attention mechanisms into the global model aggregation step.

Data Poisoning Federated Learning

QIS : Interactive Segmentation via Quasi-Conformal Mappings

no code implementations22 Feb 2024 Han Zhang, Daoping Zhang, Lok Ming Lui

In this paper, we propose the quasi-conformal interactive segmentation (QIS) model, which incorporates user input in the form of positive and negative clicks.

Image Segmentation Interactive Segmentation +2

COPR: Continual Human Preference Learning via Optimal Policy Regularization

no code implementations22 Feb 2024 Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu

Reinforcement Learning from Human Feedback (RLHF) is commonly utilized to improve the alignment of Large Language Models (LLMs) with human preferences.

Continual Learning

MUC: Mixture of Uncalibrated Cameras for Robust 3D Human Body Reconstruction

no code implementations8 Mar 2024 Yitao Zhu, Sheng Wang, Mengjie Xu, Zixu Zhuang, Zhixin Wang, Kaidong Wang, Han Zhang, Qian Wang

Next, instead of simply averaging models across views, we train a network to determine the weights of individual views for their fusion, based on the parameters estimated for joints and hands of human body as well as camera positions.

FluoroSAM: A Language-aligned Foundation Model for X-ray Image Segmentation

1 code implementation12 Mar 2024 Benjamin D. Killeen, Liam J. Wang, Han Zhang, Mehran Armand, Russell H. Taylor, Dave Dreizin, Greg Osgood, Mathias Unberath

Recently, foundation models (FMs) -- machine learning models trained on large amounts of highly variable data thus enabling broad applicability -- have emerged as promising tools for automated image analysis.

Image Segmentation Semantic Segmentation +1

QKFormer: Hierarchical Spiking Transformer using Q-K Attention

2 code implementations25 Mar 2024 Chenlin Zhou, Han Zhang, Zhaokun Zhou, Liutao Yu, Liwei Huang, Xiaopeng Fan, Li Yuan, Zhengyu Ma, Huihui Zhou, Yonghong Tian

ii) We incorporate the hierarchical structure, which significantly benefits the performance of both the brain and artificial neural networks, into spiking transformers to obtain multi-scale spiking representation.

Revolutionizing Disease Diagnosis with simultaneous functional PET/MR and Deeply Integrated Brain Metabolic, Hemodynamic, and Perfusion Networks

no code implementations29 Mar 2024 Luoyu Wang, Yitian Tao, Qing Yang, Yan Liang, Siwei Liu, Hongcheng Shi, Dinggang Shen, Han Zhang

To fully exploit the inherent complex and nonlinear relation among modalities while producing fine-grained representations for uni-modal inference, we subsequently add a modal alignment module to line up a dominant modality (e. g., PET) with representations of auxiliary modalities (MR).

Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation

no code implementations31 Mar 2024 Yitian Tao, Liyan Ma, Jing Yu, Han Zhang

To ensure the semantic consistency of the retrieved cross modal prior knowledge, a cross-modal semantic alignment module (SAM) is proposed.

A Multi-Level Framework for Accelerating Training Transformer Models

1 code implementation7 Apr 2024 Longwei Zou, Han Zhang, Yangdong Deng

Specifically, the framework is based on three basic operators, Coalescing, De-coalescing and Interpolation, which can be orchestrated to build a multi-level training framework.

Cannot find the paper you are looking for? You can Submit a new open access paper.