no code implementations • 2 Nov 2024 • Georgia Gabriela Sampaio, Ruixiang Zhang, Shuangfei Zhai, Jiatao Gu, Josh Susskind, Navdeep Jaitly, Yizhe Zhang
In this work, we focus on the text rendering aspect of these models, which provides a lens for evaluating a generative model's fine-grained instruction-following capabilities.
1 code implementation • 25 Jul 2024 • Haoran Zhu, Yifan Zhou, Chang Xu, Ruixiang Zhang, Wen Yang
This letter introduces Orthogonal Mapping (OM), a simple yet effective method aimed at addressing the challenge of semantic confusion inherent in fine-grained object detection (FGOD).
1 code implementation • 22 Jul 2024 • He Bai, Tatiana Likhomanenko, Ruixiang Zhang, Zijin Gu, Zakaria Aldeneh, Navdeep Jaitly
Large language models have revolutionized natural language processing by leveraging self-supervised pretraining on vast textual data.
no code implementations • 2 Jun 2024 • Dinghuai Zhang, Yizhe Zhang, Jiatao Gu, Ruixiang Zhang, Josh Susskind, Navdeep Jaitly, Shuangfei Zhai
Diffusion models, which are trained to match the distribution of the training dataset, have become the de-facto approach for generating visual data.
1 code implementation • 7 Mar 2024 • Yizhe Zhang, He Bai, Ruixiang Zhang, Jiatao Gu, Shuangfei Zhai, Josh Susskind, Navdeep Jaitly
Vision-Language Models (VLMs) have recently made impressive strides on diverse vision-language tasks.
1 code implementation • 16 Jan 2024 • Haoran Zhu, Chang Xu, Wen Yang, Ruixiang Zhang, Yan Zhang, Gui-Song Xia
In this study, we address the intricate issue of tiny object detection under noisy label supervision.
no code implementations • 23 Oct 2023 • Ruixiang Zhang, Chang Xu, Fang Xu, Wen Yang, Guangjun He, Huai Yu, Gui-Song Xia
This paper focuses on the scale imbalance problem of semi-supervised object detection (SSOD) in aerial images.
no code implementations • 19 Oct 2022 • Yunsheng Zhang, Jianguo Yao, Ruixiang Zhang, Siyang Chen, Haifeng Li
Hence, this work proposes a hard-negative sample aware self-supervised contrastive learning method to pre-train the model for semantic segmentation.
no code implementations • 11 Oct 2022 • Ruixiang Zhang, Tong Che, Boris Ivanovic, Renhao Wang, Marco Pavone, Yoshua Bengio, Liam Paull
Humans are remarkably good at understanding and reasoning about complex visual scenes.
7 code implementations • 8 Aug 2022 • Ting Chen, Ruixiang Zhang, Geoffrey Hinton
The main idea behind our approach is to first represent the discrete data as binary bits, and then train a continuous diffusion model to model these bits as real numbers which we call analog bits.
Ranked #7 on Image Captioning on MS COCO
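The core idea of analog bits can be sketched in a few lines: integers are expanded into binary bits, which are then rescaled to real values in {-1, 1} for the continuous diffusion model; decoding simply thresholds each real-valued bit. This is a minimal illustration of the encoding/decoding step only (function names are my own, and the diffusion model itself is omitted):

```python
import numpy as np

def int_to_analog_bits(x, n_bits=8):
    """Encode integers as 'analog bits': binary bits rescaled to {-1, 1}."""
    bits = (x[..., None] >> np.arange(n_bits)) & 1      # little-endian binary expansion
    return bits.astype(np.float32) * 2.0 - 1.0          # {0, 1} -> {-1, 1}

def analog_bits_to_int(b):
    """Decode by thresholding each real-valued bit at 0 and re-assembling."""
    hard = (b > 0).astype(np.int64)
    return (hard << np.arange(b.shape[-1])).sum(axis=-1)

x = np.array([0, 7, 255])
roundtrip = analog_bits_to_int(int_to_analog_bits(x))   # recovers x exactly
```

In the paper's setting, a continuous diffusion model is trained on these analog-bit tensors and its (real-valued) samples are thresholded back to discrete data at the end.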
1 code implementation • 27 Feb 2022 • Song Wang, Jianke Zhu, Ruixiang Zhang
LiDAR sensors are essential to the perception systems of autonomous vehicles and intelligent robots.
Ranked #18 on 3D Semantic Segmentation on SemanticKITTI
no code implementations • ICLR 2022 • Ruixiang Zhang, Shuangfei Zhai, Etai Littwin, Josh Susskind
We show that the low-rank approximation of NFKs derived from unsupervised generative models and supervised learning models gives rise to high-quality compact representations of data, achieving competitive results on a variety of machine learning tasks.
no code implementations • 29 Sep 2021 • Shuangfei Zhai, Walter Talbott, Nitish Srivastava, Chen Huang, Hanlin Goh, Ruixiang Zhang, Joshua M. Susskind
We introduce Dot Product Attention Free Transformer (DAFT), an efficient variant of the Transformer that eliminates the query-key dot product in self-attention.
Ranked #672 on Image Classification on ImageNet
7 code implementations • 28 May 2021 • Shuangfei Zhai, Walter Talbott, Nitish Srivastava, Chen Huang, Hanlin Goh, Ruixiang Zhang, Josh Susskind
We introduce Attention Free Transformer (AFT), an efficient variant of the Transformer that eliminates the need for dot-product self-attention.
no code implementations • ICCV 2021 • Jianyun Xu, Ruixiang Zhang, Jian Dou, Yushi Zhu, Jie Sun, ShiLiang Pu
The voxel-based view is regular but sparse, and computation grows cubically as voxel resolution increases.
Ranked #15 on Robust 3D Semantic Segmentation on SemanticKITTI-C
no code implementations • 19 Jan 2021 • Yifan Jing, Chieu-Minh Tran, Ruixiang Zhang
Henstock and Macbeath asked in 1953 whether the Brunn-Minkowski inequality can be generalized to nonabelian locally compact groups; questions along the same line were also asked by Hrushovski, McCrudden, and Tao.
Group Theory · Classical Analysis and ODEs · Combinatorics · Functional Analysis · Metric Geometry · MSC: 22D05, 43A05, 49Q20, 60B15
1 code implementation • International Conference on Pattern Recognition (ICPR) 2021 • Jinwang Wang, Wen Yang, Haowen Guo, Ruixiang Zhang, Gui-Song Xia
To build a benchmark for tiny object detection in aerial images, we evaluate the state-of-the-art object detectors on our AI-TOD dataset.
Ranked #5 on Object Detection on AI-TOD
no code implementations • 21 Dec 2020 • Yifei Yang, Shibing Xiang, Ruixiang Zhang
Autoencoders and their variants have been widely applied to anomaly detection. Prior work on the memory-augmented deep autoencoder proposed memorizing normality to detect anomalies, but it neglects the feature discrepancy between different resolution scales. We therefore introduce MMAE, which adds multi-scale memories that record scale-specific features, together with a multi-scale attention fuser between the encoder and decoder of the autoencoder. During unsupervised learning, MMAE updates the memory slots at each resolution scale as prototype features.
no code implementations • ICML 2020 • Ruixiang Zhang, Masanori Koyama, Katsuhiko Ishiguro
Learning controllable and generalizable representation of multivariate data with desired structural properties remains a fundamental problem in machine learning.
3 code implementations • NeurIPS 2020 • Tong Che, Ruixiang Zhang, Jascha Sohl-Dickstein, Hugo Larochelle, Liam Paull, Yuan Cao, Yoshua Bengio
To make that practical, we show that sampling from this modified density can be achieved by sampling in latent space according to an energy-based model induced by the sum of the latent prior log-density and the discriminator output score.
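The sampling rule described above can be sketched with Langevin dynamics on a latent energy E(z) = -[log p(z) + d(G(z))], where p(z) is a standard Gaussian prior and d(G(z)) is the discriminator score of the generated sample. This toy sketch uses a hand-written quadratic stand-in for the discriminator term (the real method would backpropagate through the discriminator and generator):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the discriminator score d(G(z)) and its gradient.
def grad_disc_score(z):
    return -(z - 1.0)          # gradient of -0.5 * ||z - 1||^2

def energy_grad(z):
    # E(z) = -[log p(z) + d(G(z))] with p(z) = N(0, I),
    # so grad E(z) = z - grad_z d(G(z)).
    return z - grad_disc_score(z)

def langevin_sample(z, step=0.01, n_steps=200):
    """Langevin dynamics targeting exp(-E(z)) in latent space."""
    for _ in range(n_steps):
        z = z - step * energy_grad(z) + np.sqrt(2 * step) * rng.normal(size=z.shape)
    return z

z = langevin_sample(rng.normal(size=4))
```

Samples z drawn this way are then pushed through the generator G to produce data from the discriminator-corrected density.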
no code implementations • 18 Nov 2019 • Tong Che, Xiaofeng Liu, Site Li, Yubin Ge, Ruixiang Zhang, Caiming Xiong, Yoshua Bengio
We test the verifier network on out-of-distribution detection and adversarial example detection problems, as well as anomaly detection problems in structured prediction tasks such as image caption generation.
2 code implementations • ICML 2020 • Zijun Zhang, Ruixiang Zhang, Zongpeng Li, Yoshua Bengio, Liam Paull
We therefore propose to map both the generated and target distributions to a latent space using the encoder of a standard autoencoder, and train the generator (or decoder) to match the target distribution in the latent space.
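A minimal way to picture latent-space matching is to push real and generated samples through a frozen autoencoder encoder and compare simple distributional statistics of the resulting latents. The linear encoder and moment-matching loss below are illustrative stand-ins, not the paper's actual objective:

```python
import numpy as np

# Hypothetical frozen encoder of a pretrained autoencoder (linear for brevity).
W = np.array([[0.5, -0.2], [0.1, 0.3]])
def encode(x):
    return x @ W.T

def latent_moment_loss(x_real, x_gen):
    """Compare generated and target distributions in latent space via
    first and second moments (a stand-in for a proper matching objective)."""
    z_r, z_g = encode(x_real), encode(x_gen)
    mean_gap = np.sum((z_r.mean(0) - z_g.mean(0)) ** 2)
    cov_gap = np.sum((np.cov(z_r.T) - np.cov(z_g.T)) ** 2)
    return mean_gap + cov_gap

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 2))
# Identical batches give zero loss; a shifted batch gives a large loss.
```

The generator (decoder) is then trained to drive this latent-space discrepancy toward zero rather than matching distributions directly in pixel space.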
no code implementations • NeurIPS 2018 • Ruixiang Zhang, Tong Che, Zoubin Ghahramani, Yoshua Bengio, Yangqiu Song
In this paper, we propose a conceptually simple and general framework called MetaGAN for few-shot learning problems.
1 code implementation • 30 Oct 2017 • Yao Ming, Shaozu Cao, Ruixiang Zhang, Zhen Li, Yuanzhe Chen, Yangqiu Song, Huamin Qu
We propose a technique to explain the function of individual hidden state units based on their expected response to input texts.
no code implementations • 26 Feb 2017 • Tong Che, Yan-ran Li, Ruixiang Zhang, R. Devon Hjelm, Wenjie Li, Yangqiu Song, Yoshua Bengio
Despite the successes in capturing continuous distributions, the application of generative adversarial networks (GANs) to discrete settings, like natural language tasks, is rather restricted.