Search Results for author: Ruiping Wang

Found 37 papers, 10 papers with code

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

4 code implementations • 27 Feb 2024 • Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei

Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs).

Glance and Focus: Memory Prompting for Multi-Event Video Question Answering

1 code implementation • NeurIPS 2023 • Ziyi Bai, Ruiping Wang, Xilin Chen

Instead of that, we train an Encoder-Decoder to generate a set of dynamic event memories at the glancing stage.

Ranked #1 on Video Question Answering on AGQA 2.0 balanced

Action Detection Human-Object Interaction Detection +2

BitNet: Scaling 1-bit Transformers for Large Language Models

2 code implementations • 17 Oct 2023 • Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Huaijie Wang, Lingxiao Ma, Fan Yang, Ruiping Wang, Yi Wu, Furu Wei

The increasing size of large language models has posed challenges for deployment and raised concerns about environmental impact due to high energy consumption.

Language Modelling Quantization

From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection

1 code implementation • Winter Conference on Applications of Computer Vision (WACV) 2022 • Hui Nie, Ruiping Wang, Xilin Chen

Zero-Shot Detection (ZSD), which aims at localizing and recognizing unseen objects in a complicated scene, usually leverages the visual and semantic information of individual objects alone.

Ranked #6 on Generalized Zero-Shot Object Detection on MS-COCO

Generalized Zero-Shot Object Detection Scene Understanding +1

SEGA: Semantic Guided Attention on Visual Prototype for Few-Shot Learning

1 code implementation • 8 Nov 2021 • Fengyuan Yang, Ruiping Wang, Xilin Chen

However, humans can learn new classes quickly even from few samples, since they can tell which discriminative features to focus on for each category based on both visual and semantic prior knowledge.

feature selection Few-Shot Learning

FAIEr: Fidelity and Adequacy Ensured Image Caption Evaluation

no code implementations • CVPR 2021 • Sijin Wang, Ziwei Yao, Ruiping Wang, Zhongqin Wu, Xilin Chen

Then for evaluating the adequacy of the candidate caption, it highlights the image gist on the visual scene graph under the guidance of the reference captions.

Image Captioning

FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery

no code implementations • 9 Mar 2021 • Xian Sun, Peijin Wang, Zhiyuan Yan, Feng Xu, Ruiping Wang, Wenhui Diao, Jin Chen, Jihao Li, Yingchao Feng, Tao Xu, Martin Weinmann, Stefan Hinz, Cheng Wang, Kun Fu

In this paper, we propose a novel benchmark dataset with more than 1 million instances and more than 15,000 images for Fine-grAined object recognItion in high-Resolution remote sensing imagery, named FAIR1M.

Object Detection +2

Topic Scene Graph Generation by Attention Distillation From Caption

no code implementations • ICCV 2021 • Wenbin Wang, Ruiping Wang, Xilin Chen

To this end, we let the scene graph borrow the ability from the image caption so that it can be a specialist on the basis of remaining all-around, resulting in the so-called Topic Scene Graph.

Caption Generation Graph Generation +1

Env-QA: A Video Question Answering Benchmark for Comprehensive Understanding of Dynamic Environments

no code implementations • ICCV 2021 • Difei Gao, Ruiping Wang, Ziyi Bai, Xilin Chen

Visual understanding goes well beyond the study of images or videos on the web.

Question Answering Video Question Answering

Holistic Pose Graph: Modeling Geometric Structure Among Objects in a Scene Using Graph Inference for 3D Object Prediction

no code implementations • ICCV 2021 • Jiwei Xiao, Ruiping Wang, Xilin Chen

The inference of the HPG uses GRU to encode the pose features from their corresponding regions in a single RGB image, and passes messages along the graph structure iteratively to improve the predicted poses.

Object Pose Estimation

CVPR 2020 Continual Learning in Computer Vision Competition: Approaches, Results, Current Challenges and Future Directions

1 code implementation • 14 Sep 2020 • Vincenzo Lomonaco, Lorenzo Pellegrini, Pau Rodriguez, Massimo Caccia, Qi She, Yu Chen, Quentin Jodelet, Ruiping Wang, Zheda Mai, David Vazquez, German I. Parisi, Nikhil Churamani, Marc Pickett, Issam Laradji, Davide Maltoni

In the last few years, we have witnessed a renewed and fast-growing interest in continual learning with deep neural networks with the shared objective of making current AI systems more adaptive, efficient and autonomous.

Benchmarking Continual Learning

Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation

1 code implementation • ECCV 2020 • Wenbin Wang, Ruiping Wang, Shiguang Shan, Xilin Chen

Scene graph aims to faithfully reveal humans' perception of image content.

Graph Generation Scene Graph Generation

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

1 code implementation • CVPR 2020 • Difei Gao, Ke Li, Ruiping Wang, Shiguang Shan, Xilin Chen

Then, we introduce three aggregators which guide the message passing from one graph to another to utilize the contexts in various modalities, so as to refine the features of nodes.

Question Answering Visual Question Answering (VQA)

Deep Heterogeneous Hashing for Face Video Retrieval

no code implementations • 4 Nov 2019 • Shishi Qiao, Ruiping Wang, Shiguang Shan, Xilin Chen

To tackle the key challenge of hashing on the manifold, a well-studied Riemannian kernel mapping is employed to project data (i.e., covariance matrices) into Euclidean space, which makes it possible to embed the two heterogeneous representations into a common Hamming space, where both intra-space discriminability and inter-space compatibility are considered.

Retrieval Video Retrieval

Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval

1 code implementation • 11 Oct 2019 • Sijin Wang, Ruiping Wang, Ziwei Yao, Shiguang Shan, Xilin Chen

In the light of recent success of scene graph in many CV and NLP tasks for describing complex natural scenes, we propose to represent image and text with two kinds of scene graphs: visual scene graph (VSG) and textual scene graph (TSG), each of which is exploited to jointly characterize objects and relationships in the corresponding modality.

Graph Matching Retrieval +1

Hierarchical Disentangle Network for Object Representation Learning

no code implementations • 25 Sep 2019 • Shishi Qiao, Ruiping Wang, Shiguang Shan, Xilin Chen

In this paper, we propose the hierarchical disentangle network (HDN) to exploit the rich hierarchical characteristics among categories to divide the disentangling process in a coarse-to-fine manner, such that each level only focuses on learning the specific representations in its granularity and finally the common and unique representations in all granularities jointly constitute the raw object.

Decoder Disentanglement +2

Transferable Contrastive Network for Generalized Zero-Shot Learning

no code implementations • ICCV 2019 • Huajie Jiang, Ruiping Wang, Shiguang Shan, Xilin Chen

Zero-shot learning (ZSL) is a challenging problem that aims to recognize the target categories without seen data, where semantic information is leveraged to transfer knowledge from some source classes.

Ranked #6 on Zero-Shot Learning on SUN Attribute

Generalized Zero-Shot Learning Transfer Learning

CRIC: A VQA Dataset for Compositional Reasoning on Vision and Commonsense

no code implementations • 8 Aug 2019 • Difei Gao, Ruiping Wang, Shiguang Shan, Xilin Chen

To comprehensively evaluate such abilities, we propose a VQA benchmark, CRIC, which introduces new types of questions about Compositional Reasoning on vIsion and Commonsense, and an evaluation metric integrating the correctness of answering and commonsense grounding.

Question Answering Visual Question Answering (VQA)

WIDER Face and Pedestrian Challenge 2018: Methods and Results

no code implementations • 19 Feb 2019 • Chen Change Loy, Dahua Lin, Wanli Ouyang, Yuanjun Xiong, Shuo Yang, Qingqiu Huang, Dongzhan Zhou, Wei Xia, Quanquan Li, Ping Luo, Junjie Yan, Jian-Feng Wang, Zuoxin Li, Ye Yuan, Boxun Li, Shuai Shao, Gang Yu, Fangyun Wei, Xiang Ming, Dong Chen, Shifeng Zhang, Cheng Chi, Zhen Lei, Stan Z. Li, Hongkai Zhang, Bingpeng Ma, Hong Chang, Shiguang Shan, Xilin Chen, Wu Liu, Boyan Zhou, Huaxiong Li, Peng Cheng, Tao Mei, Artem Kukharenko, Artem Vasenin, Nikolay Sergievskiy, Hua Yang, Liangqi Li, Qiling Xu, Yuan Hong, Lin Chen, Mingjun Sun, Yirong Mao, Shiying Luo, Yongjun Li, Ruiping Wang, Qiaokang Xie, Ziyang Wu, Lei Lu, Yiheng Liu, Wengang Zhou

This paper presents a review of the 2018 WIDER Challenge on Face and Pedestrian.

Face Detection Pedestrian Detection +2

Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition

no code implementations • ECCV 2018 • Huajie Jiang, Ruiping Wang, Shiguang Shan, Xilin Chen

Zero-shot learning (ZSL) aims to recognize objects of novel classes without any training samples of specific classes, which is achieved by exploiting the semantic information and auxiliary datasets.

Dictionary Learning Zero-Shot Learning

Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships

no code implementations • CVPR 2018 • Yong Liu, Ruiping Wang, Shiguang Shan, Xilin Chen

Context is important for accurate visual recognition.

Object Detection +1

Learning Discriminative Latent Attributes for Zero-Shot Classification

no code implementations • ICCV 2017 • Huajie Jiang, Ruiping Wang, Shiguang Shan, Yi Yang, Xilin Chen

Zero-shot learning (ZSL) aims to transfer knowledge from observed classes to the unseen classes, based on the assumption that both the seen and unseen classes share a common semantic space, among which attributes enjoy a great popularity.

Attribute Classification +3

Discriminative Covariance Oriented Representation Learning for Face Recognition With Image Sets

no code implementations • CVPR 2017 • Wen Wang, Ruiping Wang, Shiguang Shan, Xilin Chen

For face recognition with image sets, most existing works focus on building robust set models with hand-crafted features, and it remains a research gap to learn better image representations that closely match the subsequent image set modeling and classification.

Face Recognition General Classification +2

Learning Multifunctional Binary Codes for Both Category and Attribute Oriented Retrieval Tasks

no code implementations • CVPR 2017 • Haomiao Liu, Ruiping Wang, Shiguang Shan, Xilin Chen

In this paper we propose a unified framework to address multiple realistic image retrieval tasks concerning both category and attributes.

Attribute Image Retrieval +1

Geometry-aware Similarity Learning on SPD Manifolds for Visual Recognition

no code implementations • 17 Aug 2016 • Zhiwu Huang, Ruiping Wang, Xianqiu Li, Wenxian Liu, Shiguang Shan, Luc van Gool, Xilin Chen

Specifically, by exploiting the Riemannian geometry of the manifold of fixed-rank Positive Semidefinite (PSD) matrices, we present a new solution to reduce optimizing over the space of column full-rank transformation matrices to optimizing on the PSD manifold which has a well-established Riemannian structure.

Cross Euclidean-to-Riemannian Metric Learning with Application to Face Recognition from Video

no code implementations • 15 Aug 2016 • Zhiwu Huang, Ruiping Wang, Shiguang Shan, Luc van Gool, Xilin Chen

With this mapping, the problem of learning a cross-view metric between the two source heterogeneous spaces can be expressed as learning a single-view Euclidean distance metric in the target common Euclidean space.

Face Recognition Metric Learning

Dual Purpose Hashing

no code implementations • 19 Jul 2016 • Haomiao Liu, Ruiping Wang, Shiguang Shan, Xilin Chen

Recent years have seen more and more demand for a unified framework to address multiple realistic image retrieval tasks concerning both category and attributes.

Attribute Image Retrieval +1

Deep Supervised Hashing for Fast Image Retrieval

1 code implementation • CVPR 2016 • Haomiao Liu, Ruiping Wang, Shiguang Shan, Xilin Chen

In this paper, we present a new hashing method to learn compact binary codes for highly efficient image retrieval on large-scale datasets.

Ranked #1 on Image Retrieval on CIFAR-10

Image Retrieval Retrieval

Two Birds, One Stone: Jointly Learning Binary Code for Large-Scale Face Image Retrieval and Attributes Prediction

no code implementations • ICCV 2015 • Yan Li, Ruiping Wang, Haomiao Liu, Huajie Jiang, Shiguang Shan, Xilin Chen

In this way, the learned binary codes can be applied to not only fine-grained face image retrieval, but also facial attributes prediction, which is the very innovation of this work, just like killing two birds with one stone.

Face Image Retrieval Retrieval

Learning Mid-level Words on Riemannian Manifold for Action Recognition

no code implementations • 16 Nov 2015 • Mengyi Liu, Ruiping Wang, Shiguang Shan, Xilin Chen

Human action recognition remains a challenging task due to the various sources of video data and large intra-class variations.

Action Recognition Clustering +1

Learning Expressionlets via Universal Manifold Model for Dynamic Facial Expression Recognition

no code implementations • 16 Nov 2015 • Mengyi Liu, Shiguang Shan, Ruiping Wang, Xilin Chen

3) the local modes on each STM can be instantiated by fitting to UMM, and the corresponding expressionlet is constructed by modeling the variations in each local mode.

Dynamic Facial Expression Recognition Facial Expression Recognition +1

Face Video Retrieval With Image Query via Hashing Across Euclidean Space and Riemannian Manifold

no code implementations • CVPR 2015 • Yan Li, Ruiping Wang, Zhiwu Huang, Shiguang Shan, Xilin Chen

Retrieving videos of a specific person given his/her face image as query becomes more and more appealing for applications like smart movie fast-forwards and suspect searching.

Retrieval Video Retrieval

Discriminant Analysis on Riemannian Manifold of Gaussian Distributions for Face Recognition With Image Sets

no code implementations • CVPR 2015 • Wen Wang, Ruiping Wang, Zhiwu Huang, Shiguang Shan, Xilin Chen

This paper presents a method named Discriminant Analysis on Riemannian manifold of Gaussian distributions (DARG) to solve the problem of face recognition with image sets.

Face Identification Face Recognition +1

Projection Metric Learning on Grassmann Manifold With Application to Video Based Face Recognition

no code implementations • CVPR 2015 • Zhiwu Huang, Ruiping Wang, Shiguang Shan, Xilin Chen

In video based face recognition, great success has been made by representing videos as linear subspaces, which typically lie in a special type of non-Euclidean space known as Grassmann manifold.

Dimensionality Reduction Face Recognition +1

A new hierarchical method for inter-patient heartbeat classification using random projections and RR intervals

no code implementations • BioMedical Engineering OnLine 2014 • Huifang Huang, Jie Liu, Qiang Zhu, Ruiping Wang, Guangshu Hu

This was done in order to improve the classification performance of these two classes of heartbeats by using different features and classification methods.

Ranked #2 on Heartbeat Classification on MIT-BIH AR

Classification General Classification +1

Learning Euclidean-to-Riemannian Metric for Point-to-Set Classification

no code implementations • CVPR 2014 • Zhiwu Huang, Ruiping Wang, Shiguang Shan, Xilin Chen

Since the points commonly lie in Euclidean space while the sets are typically modeled as elements on Riemannian manifold, they can be treated as Euclidean points and Riemannian points respectively.

Classification General Classification +1

Learning Expressionlets on Spatio-Temporal Manifold for Dynamic Facial Expression Recognition

no code implementations • CVPR 2014 • Mengyi Liu, Shiguang Shan, Ruiping Wang, Xilin Chen

In this paper, we attempt to solve both problems via manifold modeling of videos based on a novel mid-level representation, i.e., expressionlet.

Dynamic Facial Expression Recognition Facial Expression Recognition +1
