Search Results for author: Ze Liu

Found 17 papers, 12 papers with code

Syntactically Diverse Adversarial Network for Knowledge-Grounded Conversation Generation

no code implementations • Findings (EMNLP) 2021 • Fuwei Cui, Hui Di, Hongjie Ren, Kazushige Ouchi, Ze Liu, Jinan Xu

Generative conversation systems tend to produce meaningless and generic responses, which significantly reduce the user experience.

Informativeness

Paper
Add Code

F5C-finder: An Explainable and Ensemble Biological Language Model for Predicting 5-Formylcytidine Modifications on mRNA

no code implementations • 20 Apr 2024 • Guohao Wang, Ting Liu, Hongqiang Lyu, Ze Liu

The result highlights the effectiveness of biological language model in capturing both the order (sequential) and functional meaning (semantics) within genomes.

Ensemble Learning Language Modelling

Paper
Add Code

FP8-LM: Training FP8 Large Language Models

1 code implementation • 27 Oct 2023 • Houwen Peng, Kan Wu, Yixuan Wei, Guoshuai Zhao, Yuxiang Yang, Ze Liu, Yifan Xiong, Ziyue Yang, Bolin Ni, Jingcheng Hu, Ruihang Li, Miaosen Zhang, Chen Li, Jia Ning, Ruizhe Wang, Zheng Zhang, Shuguang Liu, Joe Chau, Han Hu, Peng Cheng

In this paper, we explore FP8 low-bit data formats for efficient training of large language models (LLMs).

459

Paper
Code

Comparative Analysis of Transfer Learning in Deep Learning Text-to-Speech Models on a Few-Shot, Low-Resource, Customized Dataset

no code implementations • 8 Oct 2023 • Ze Liu

Given the growing computational complexity of these models and the scarcity of large, high-quality datasets, this research focuses on transfer learning, especially on few-shot, low-resource, and customized datasets.

Transfer Learning

Paper
Add Code

V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection

1 code implementation • 8 Aug 2023 • Yichao Shen, Zigang Geng, Yuhui Yuan, Yutong Lin, Ze Liu, Chunyu Wang, Han Hu, Nanning Zheng, Baining Guo

We introduce a highly performant 3D object detector for point clouds using the DETR framework.

Ranked #2 on 3D Object Detection on ScanNetV2

3D Object Detection Decoder +2

Paper
Code

Human Pose as Compositional Tokens

1 code implementation • CVPR 2023 • Zigang Geng, Chunyu Wang, Yixuan Wei, Ze Liu, Houqiang Li, Han Hu

Human pose is typically represented by a coordinate vector of body joints or their heatmap embeddings.

Ranked #1 on Pose Estimation on MPII Human Pose

Decoder Pose Estimation

262

Paper
Code

Improving CLIP Fine-tuning Performance

1 code implementation • ICCV 2023 • Yixuan Wei, Han Hu, Zhenda Xie, Ze Liu, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo

Experiments suggest that the feature map distillation approach significantly boosts the fine-tuning performance of CLIP models on several typical downstream vision tasks.

object-detection Object Detection +1

220

Paper
Code

Could Giant Pretrained Image Models Extract Universal Representations?

no code implementations • 3 Nov 2022 • Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao

In this paper, we present a study of frozen pretrained models when applied to diverse and representative computer vision tasks, including object detection, semantic segmentation and video action recognition.

Ranked #3 on Action Recognition In Videos on Kinetics-400

Action Recognition In Videos Instance Segmentation +5

Paper
Add Code

Tutel: Adaptive Mixture-of-Experts at Scale

2 code implementations • 7 Jun 2022 • Changho Hwang, Wei Cui, Yifan Xiong, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Salas, Jithin Jose, Prabhat Ram, Joe Chau, Peng Cheng, Fan Yang, Mao Yang, Yongqiang Xiong

On efficiency, Flex accelerates SwinV2-MoE, achieving up to 1. 55x and 2. 11x speedup in training and inference over Fairseq, respectively.

Object Detection

13,030

Paper
Code

Swin Transformer V2: Scaling Up Capacity and Resolution

19 code implementations • CVPR 2022 • Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo

Three main techniques are proposed: 1) a residual-post-norm method combined with cosine attention to improve training stability; 2) A log-spaced continuous position bias method to effectively transfer models pre-trained using low-resolution images to downstream tasks with high-resolution inputs; 3) A self-supervised pre-training method, SimMIM, to reduce the needs of vast labeled images.

Ranked #4 on Image Classification on ImageNet V2 (using extra training data)

Action Classification Image Classification +3

29,908

Paper
Code

Video Swin Transformer

14 code implementations • CVPR 2022 • Ze Liu, Jia Ning, Yue Cao, Yixuan Wei, Zheng Zhang, Stephen Lin, Han Hu

The vision community is witnessing a modeling shift from CNNs to Transformers, where pure Transformer architectures have attained top accuracy on the major video recognition benchmarks.

Ranked #28 on Action Classification on Kinetics-600 (using extra training data)

Action Classification Action Recognition +5

3,919

Paper
Code

Group-Free 3D Object Detection via Transformers

4 code implementations • ICCV 2021 • Ze Liu, Zheng Zhang, Yue Cao, Han Hu, Xin Tong

Instead of grouping local points to each object candidate, our method computes the feature of an object from all the points in the point cloud with the help of an attention mechanism in the Transformers \cite{vaswani2017attention}, where the contribution of each point is automatically learned in the network training.

Ranked #3 on 3D Object Detection on SUN-RGBD

3D Object Detection Object +1

240

Paper
Code

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

71 code implementations • ICCV 2021 • Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo

This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision.

Ranked #2 on Image Classification on OmniBenchmark

Image Classification Instance Segmentation +3

125,725

Paper
Code

RGB-D Salient Object Detection via 3D Convolutional Neural Networks

1 code implementation • 25 Jan 2021 • Qian Chen, Ze Liu, Yi Zhang, Keren Fu, Qijun Zhao, Hongwei Du

The proposed model, named RD3D, aims at pre-fusion in the encoder stage and in-depth fusion in the decoder stage to effectively promote the full integration of RGB and depth streams.

Ranked #8 on Thermal Image Segmentation on RGB-T-Glass-Segmentation

Decoder object-detection +3

Paper
Code

Leveraging Batch Normalization for Vision Transformers

no code implementations • ICCVW 2021 • Zhuliang Yao, Yue Cao, Yutong Lin, Ze Liu, Zheng Zhang, Han Hu

Transformer-based vision architectures have attracted great attention because of the strong performance over the convolutional neural networks (CNNs).

Paper
Add Code

EF-Net: A novel enhancement and fusion network for RGB-D saliency detection

1 code implementation • 4 Nov 2020 • Qian Chen, Keren Fu, Ze Liu, Geng Chen, Hongwei Du, Bensheng Qiu, LingShao

Finally, we propose an effective layer-wise aggregation module to fuse the features extracted from the enhanced depth maps and RGB images for the accurate detection of salient objects.

object-detection Object Detection +2

Paper
Code

A Closer Look at Local Aggregation Operators in Point Cloud Analysis

1 code implementation • ECCV 2020 • Ze Liu, Han Hu, Yue Cao, Zheng Zhang, Xin Tong

Our investigation reveals that despite the different designs of these operators, all of these operators make surprisingly similar contributions to the network performance under the same network input and feature numbers and result in the state-of-the-art accuracy on standard benchmarks.

Ranked #4 on 3D Semantic Segmentation on PartNet

3D Semantic Segmentation

251

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.