Search Results for author: Zeyu Wang

Found 51 papers, 26 papers with code

Camera-Based Remote Physiology Sensing for Hundreds of Subjects Across Skin Tones

1 code implementation · 7 Apr 2024 · Jiankai Tang, Xinyi Li, Jiacheng Liu, Xiyuxing Zhang, Zeyu Wang, Yuntao Wang

Remote photoplethysmography (rPPG) emerges as a promising method for non-invasive, convenient measurement of vital signs, utilizing the widespread presence of cameras.

SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting

1 code implementation · 8 Mar 2024 · Zhijing Shao, Zhaolong Wang, Zhuang Li, Duotun Wang, Xiangru Lin, Yu Zhang, Mingming Fan, Zeyu Wang

We present SplattingAvatar, a hybrid 3D representation of photorealistic human avatars with Gaussian Splatting embedded on a triangle mesh, which renders over 300 FPS on a modern GPU and 30 FPS on a mobile device.

MemoNav: Working Memory Model for Visual Navigation

1 code implementation · 29 Feb 2024 · Hongxin Li, Zeyu Wang, Xu Yang, Yuran Yang, Shuqi Mei, Zhaoxiang Zhang

Subsequently, a graph attention module encodes the retained STM and the LTM to generate working memory (WM) which contains the scene features essential for efficient navigation.

Decision Making, Graph Attention +2

Multi-Human Mesh Recovery with Transformers

no code implementations · 26 Feb 2024 · Zeyu Wang, Zhenzhen Weng, Serena Yeung-Levy

Conventional approaches to human mesh recovery predominantly employ a region-based strategy.

Human Mesh Recovery

BoNuS: Boundary Mining for Nuclei Segmentation with Partial Point Labels

1 code implementation · 15 Jan 2024 · Yi Lin, Zeyu Wang, Dong Zhang, Kwang-Ting Cheng, Hao Chen

To alleviate this problem, in this paper, we propose a weakly-supervised nuclei segmentation method that only requires partial point labels of nuclei.

Multiple Instance Learning, Segmentation

Revisiting Adversarial Training at Scale

1 code implementation · 9 Jan 2024 · Zeyu Wang, Xianhang Li, Hongru Zhu, Cihang Xie

For example, by training on the DataComp-1B dataset, our AdvXL empowers a vanilla ViT-g model to substantially surpass the previous records of $l_{\infty}$-, $l_{2}$-, and $l_{1}$-robust accuracy by margins of 11.4%, 14.2%, and 12.9%, respectively.

Multi-Modal Representation Learning for Molecular Property Prediction: Sequence, Graph, Geometry

1 code implementation · 7 Jan 2024 · Zeyu Wang, Tianyi Jiang, Jinhuan Wang, Qi Xuan

Molecular property prediction refers to the task of labeling molecules with some biochemical properties, playing a pivotal role in the drug discovery and design process.

Data Augmentation, Drug Discovery +4

A Robust Quantile Huber Loss With Interpretable Parameter Adjustment In Distributional Reinforcement Learning

1 code implementation · 4 Jan 2024 · Parvin Malekzadeh, Konstantinos N. Plataniotis, Zissis Poulos, Zeyu Wang

Distributional Reinforcement Learning (RL) estimates return distribution mainly by learning quantile values via minimizing the quantile Huber loss function, entailing a threshold parameter often selected heuristically or via hyperparameter search, which may not generalize well and can be suboptimal.

Atari Games, Distributional Reinforcement Learning +1
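
For readers unfamiliar with the loss mentioned in the abstract above, here is a minimal sketch of the conventional quantile Huber loss with a fixed threshold. It is not the robust variant the paper proposes. The function names and the default `kappa = 1.0` are illustrative assumptions, and the form shown omits the division by `kappa` that some implementations apply.

```python
def huber(u: float, kappa: float) -> float:
    """Huber loss: quadratic for |u| <= kappa, linear beyond it."""
    if abs(u) <= kappa:
        return 0.5 * u * u
    return kappa * (abs(u) - 0.5 * kappa)


def quantile_huber(u: float, tau: float, kappa: float = 1.0) -> float:
    """Conventional quantile Huber loss for one error u at quantile level tau.

    The asymmetric weight |tau - 1{u < 0}| penalizes over- and
    under-estimation differently depending on tau.
    kappa is the threshold parameter (assumed default: 1.0).
    """
    indicator = 1.0 if u < 0 else 0.0
    return abs(tau - indicator) * huber(u, kappa)


if __name__ == "__main__":
    # A small positive error falls in the quadratic region.
    print(quantile_huber(0.5, tau=0.25))
    # A large negative error falls in the linear region.
    print(quantile_huber(-2.0, tau=0.25))
```

The threshold `kappa` sets where the loss switches from quadratic to linear. This is the parameter the abstract describes as often selected heuristically or via hyperparameter search.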

Voila-A: Aligning Vision-Language Models with User's Gaze Attention

no code implementations · 22 Dec 2023 · Kun Yan, Lei Ji, Zeyu Wang, Yuntao Wang, Nan Duan, Shuai Ma

In this paper, we introduce gaze information, feasibly collected by AR or VR devices, as a proxy for human attention to guide VLMs and propose a novel approach, Voila-A, for gaze alignment to enhance the interpretability and effectiveness of these models in real-world applications.

MagicScroll: Nontypical Aspect-Ratio Image Generation for Visual Storytelling via Multi-Layered Semantic-Aware Denoising

no code implementations · 18 Dec 2023 · Bingyuan Wang, Hengyu Meng, Zeyu Cai, Lanjiong Li, Yue Ma, Qifeng Chen, Zeyu Wang

Visual storytelling often uses nontypical aspect-ratio images like scroll paintings, comic strips, and panoramas to create an expressive and compelling narrative.

Denoising, Image Generation +1

Dietary Assessment with Multimodal ChatGPT: A Systematic Analysis

no code implementations · 14 Dec 2023 · Frank P.-W. Lo, Jianing Qiu, Zeyu Wang, Junhong Chen, Bo Xiao, Wu Yuan, Stamatia Giannarou, Gary Frost, Benny Lo

Although artificial intelligence (AI)-based solutions have been devised to automate the dietary assessment process, these prior AI methodologies encounter challenges in their ability to generalize across a diverse range of food types, dietary behaviors, and cultural contexts.

Image Captioning, Scene Understanding

Rejuvenating image-GPT as Strong Visual Representation Learners

1 code implementation · 4 Dec 2023 · Sucheng Ren, Zeyu Wang, Hongru Zhu, Junfei Xiao, Alan Yuille, Cihang Xie

This paper enhances image-GPT (iGPT), one of the pioneering works that introduce autoregressive pretraining to predict next pixels for visual representation learning.

Representation Learning

Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training

1 code implementation · 3 Nov 2023 · Yipeng Gao, Zeyu Wang, Wei-Shi Zheng, Cihang Xie, Yuyin Zhou

Contrastive learning has emerged as a promising paradigm for 3D open-world understanding, i.e., aligning point cloud representation to image and text embedding space individually.

Ranked #1 on Zero-shot 3D classification on Objaverse LVIS (using extra training data)

Contrastive Learning, Retrieval +3

FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning

1 code implementation · 6 Oct 2023 · Peiran Xu, Zeyu Wang, Jieru Mei, Liangqiong Qu, Alan Yuille, Cihang Xie, Yuyin Zhou

Federated learning (FL) is an emerging paradigm in machine learning, where a shared model is collaboratively learned using data from multiple devices to mitigate the risk of data leakage.

Federated Learning

Few-Shot Multi-Label Aspect Category Detection Utilizing Prototypical Network with Sentence-Level Weighting and Label Augmentation

no code implementations · 27 Sep 2023 · Zeyu Wang, Mizuho Iwaihara

Since aspect category detection often suffers from limited datasets and data sparsity, the prototypical network with attention mechanisms has been applied for few-shot aspect category detection.

Aspect Category Detection, Sentence

DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation

1 code implementation · ICCV 2023 · Zeyu Wang, Dingwen Li, Chenxu Luo, Cihang Xie, Xiaodong Yang

In this work, we propose to boost the representation learning of a multi-camera BEV based student detector by training it to imitate the features of a well-trained LiDAR based teacher detector.

3D Object Detection, Autonomous Driving +4

CLIPA-v2: Scaling CLIP Training with 81.1% Zero-shot ImageNet Accuracy within a \$10,000 Budget; An Extra \$4,000 Unlocks 81.8% Accuracy

2 code implementations · 27 Jun 2023 · Xianhang Li, Zeyu Wang, Cihang Xie

The recent work CLIPA presents an inverse scaling law for CLIP training -- whereby the larger the image/text encoders used, the shorter the sequence length of image/text tokens that can be applied in training.

Subgraph Networks Based Contrastive Learning

no code implementations · 6 Jun 2023 · Jinhuan Wang, Jiafei Shao, Zeyu Wang, Shanqing Yu, Qi Xuan, Xiaoniu Yang

In addition, we also investigate the impact of the second-order subgraph augmentation on mining graph structure interactions, and further, propose a contrastive objective that fuses the first-order and second-order subgraph information.

Attribute, Contrastive Learning +3

ZeroAvatar: Zero-shot 3D Avatar Generation from a Single Image

no code implementations · 25 May 2023 · Zhenzhen Weng, Zeyu Wang, Serena Yeung

Recent advancements in text-to-image generation have enabled significant progress in zero-shot 3D shape generation.

3D Shape Generation, Image to 3D +1

SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification

3 code implementations · 16 May 2023 · Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia

Our evaluation shows that SpecInfer outperforms existing LLM serving systems by 1.5-2.8x for distributed LLM inference and by 2.6-3.5x for offloading-based LLM inference, while preserving the same generative performance.

Language Modelling, Large Language Model

An Inverse Scaling Law for CLIP Training

1 code implementation · NeurIPS 2023 · Xianhang Li, Zeyu Wang, Cihang Xie

However, its associated training cost is prohibitively high, imposing a significant barrier to its widespread exploration.

Click-Feedback Retrieval

no code implementations · 28 Apr 2023 · Zeyu Wang, Yu Wu

In this work, we study a setting where feedback is provided by users clicking on liked and disliked search results.

Retrieval

A robust method for reliability updating with equality information using sequential adaptive importance sampling

no code implementations · 8 Mar 2023 · Xiong Xiao, Zeyu Wang, Quanwang Li

Reliability updating refers to a problem that integrates Bayesian updating technique with structural reliability analysis and cannot be directly solved by structural reliability methods (SRMs) when it involves equality information.

Computational Efficiency

On the Adversarial Robustness of Camera-based 3D Object Detection

1 code implementation · 25 Jan 2023 · Shaoyuan Xie, Zichao Li, Zeyu Wang, Cihang Xie

In recent years, camera-based 3D object detection has gained widespread attention for its ability to achieve high performance with low computational cost.

3D Object Detection, Adversarial Attack +5

Thermal Infrared Image Inpainting via Edge-Aware Guidance

no code implementations · 28 Oct 2022 · Zeyu Wang, Haibin Shen, Changyou Men, Quan Sun, Kejie Huang

In this paper, we propose a novel task -- Thermal Infrared Image Inpainting, which aims to reconstruct missing regions of TIR images.

Image Inpainting

Bag of Tricks for FGSM Adversarial Training

no code implementations · 6 Sep 2022 · Zichao Li, Li Liu, Zeyu Wang, Yuyin Zhou, Cihang Xie

Adversarial training (AT) with samples generated by Fast Gradient Sign Method (FGSM), also known as FGSM-AT, is a computationally simple method to train robust networks.
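
As background for the abstract above, the following is a minimal sketch of a single FGSM perturbation step, x_adv = x + eps * sign(grad_x loss), on a toy logistic-regression model. The model, the function names, and the values of `w` and `eps` are illustrative assumptions. None of the paper's training tricks are shown.

```python
import math


def logistic_loss(x, y, w):
    """Logistic loss log(1 + exp(-y * w.x)) for a label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return math.log(1.0 + math.exp(-margin))


def logistic_loss_grad_x(x, y, w):
    """Gradient of the logistic loss with respect to the input x."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * wi for wi in w]


def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(x, y, w, eps):
    """One FGSM step: move each input coordinate by eps in the
    direction that increases the loss."""
    grad = logistic_loss_grad_x(x, y, w)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]


if __name__ == "__main__":
    x, y, w = [1.0, -2.0], 1, [0.5, -1.0]  # hypothetical toy values
    x_adv = fgsm(x, y, w, eps=0.1)
    print(x_adv, logistic_loss(x, y, w), logistic_loss(x_adv, y, w))
```

In FGSM-based adversarial training, perturbed inputs such as `x_adv` are used as training samples. The step costs one gradient computation per input, which is why the abstract calls the method computationally simple.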

Masked Autoencoders Enable Efficient Knowledge Distillers

1 code implementation · CVPR 2023 · Yutong Bai, Zeyu Wang, Junfei Xiao, Chen Wei, Huiyu Wang, Alan Yuille, Yuyin Zhou, Cihang Xie

For example, by distilling the knowledge from an MAE pre-trained ViT-L into a ViT-B, our method achieves 84.0% ImageNet top-1 accuracy, outperforming the baseline of directly distilling a fine-tuned ViT-L by 1.2%.

Knowledge Distillation

Predicting Word Learning in Children from the Performance of Computer Vision Systems

no code implementations · 7 Jul 2022 · Sunayana Rane, Mira L. Nencheva, Zeyu Wang, Casey Lew-Williams, Olga Russakovsky, Thomas L. Griffiths

The performance of the computer vision systems is correlated with human judgments of the concreteness of words, which are in turn a predictor of children's word learning, suggesting that these models are capturing the relationship between words and visual phenomena.

Image Captioning

Can CNNs Be More Robust Than Transformers?

1 code implementation · 7 Jun 2022 · Zeyu Wang, Yutong Bai, Yuyin Zhou, Cihang Xie

The recent success of Vision Transformers is shaking the decade-long dominance of Convolutional Neural Networks (CNNs) in image recognition.

Gamma and Vega Hedging Using Deep Distributional Reinforcement Learning

1 code implementation · 10 May 2022 · Jay Cao, Jacky Chen, Soroush Farghadani, John Hull, Zissis Poulos, Zeyu Wang, Jun Yuan

We show how D4PG can be used in conjunction with quantile regression to develop a hedging strategy for a trader responsible for derivatives that arrive stochastically and depend on a single underlying asset.

Distributional Reinforcement Learning, Position +2

Multi-Query Video Retrieval

1 code implementation · 10 Jan 2022 · Zeyu Wang, Yu Wu, Karthik Narasimhan, Olga Russakovsky

Retrieving target videos based on text descriptions is a task of great practical value and has received increasing attention over the past few years.

Retrieval, Video Retrieval

Human Activity Recognition Using 3D Orthogonally-projected EfficientNet on Radar Time-Range-Doppler Signature

no code implementations · 24 Nov 2021 · Zeyu Wang, Chenglin Yao, Jianfeng Ren, Xudong Jiang

In radar activity recognition, 2D signal representations such as spectrogram, cepstrum and cadence velocity diagram are often utilized, while range information is often neglected.

Human Activity Recognition

Chi-square Loss for Softmax: an Echo of Neural Network Structure

no code implementations · 31 Aug 2021 · Zeyu Wang, Meiqing Wang

In addition, we studied the sample distribution of this loss function by visualization and found that the distribution is related to the neural network structure, which is distinct compared to cross-entropy.

Semi-Supervised Crowd Counting from Unlabeled Data

no code implementations · 31 Aug 2021 · Haoran Duan, Fan Wan, Rui Sun, Zeyu Wang, Varun Ojha, Yu Guan, Hubert P. H. Shum, Bingzhang Hu, Yang Long

Our method achieved competitive performance among semi-supervised learning approaches on these crowd counting datasets.

Crowd Counting

maplet: An extensible R toolbox for modular and reproducible omics pipelines

no code implementations · 6 May 2021 · Kelsey Chetnik, Elisa Benedetti, Daniel P. Gomari, Annalise Schweickart, Richa Batra, Mustafa Buyukozkan, Zeyu Wang, Matthias Arnold, Jonas Zierer, Karsten Suhre, Jan Krumsiek

This paper presents maplet, an open-source R package for the creation of highly customizable, fully reproducible statistical pipelines for omics data analysis, with a special focus on metabolomics-based methods.

Data Visualization

Plot2API: Recommending Graphic API from Plot via Semantic Parsing Guided Neural Network

1 code implementation · 2 Apr 2021 · Zeyu Wang, Sheng Huang, Zhongxin Liu, Meng Yan, Xin Xia, Bei Wang, Dan Yang

Considering the lack of technologies in Plot2API, we present a novel deep multi-task learning approach named Semantic Parsing Guided Neural Network (SPGNN), which casts the Plot2API problem as a multi-label image classification task and an image semantic parsing task.

Data Augmentation, Data Visualization +3

Towards Unique and Informative Captioning of Images

1 code implementation · ECCV 2020 · Zeyu Wang, Berthy Feng, Karthik Narasimhan, Olga Russakovsky

We find that modern captioning systems return higher likelihoods for incorrect distractor sentences compared to ground truth captions, and that evaluation metrics like SPICE can be 'topped' using simple captioning systems relying on object detectors.

Image Captioning, Re-Ranking

REAK: Reliability analysis through Error rate-based Adaptive Kriging

no code implementations · 4 Feb 2020 · Zeyu Wang, Abdollah Shafieezadeh

Reliability analysis for these systems when failure probabilities are small is significantly challenging, requiring a large number of costly simulations.

SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines

6 code implementations · 14 Nov 2019 · Yinda Xu, Zeyu Wang, Zuoxin Li, Ye Yuan, Gang Yu

Following these guidelines, we design our Fully Convolutional Siamese tracker++ (SiamFC++) by introducing both classification and target state estimation branches (G1), classification score without ambiguity (G2), tracking without prior knowledge (G3), and estimation quality score (G4).

Ranked #2 on Visual Object Tracking on VOT2017/18 (using extra training data)

Classification, General Classification +3

Self-supervised Learning of Detailed 3D Face Reconstruction

1 code implementation · 25 Oct 2019 · Yajing Chen, Fanzi Wu, Zeyu Wang, Yibing Song, Yonggen Ling, Linchao Bao

The displacement map and the coarse model are used to render a final detailed face, which again can be compared with the original input image to serve as a photometric loss for the second stage.

3D Face Reconstruction, Face Alignment +1

SCANN: Synthesis of Compact and Accurate Neural Networks

no code implementations · 19 Apr 2019 · Shayan Hassantabar, Zeyu Wang, Niraj K. Jha

To address these challenges, we propose a two-step neural network synthesis methodology, called DR+SCANN, that combines two complementary approaches to design compact and accurate DNNs.

Dimensionality Reduction, Neural Network Compression +1

Multi-band Weighted $l_p$ Norm Minimization for Image Denoising

no code implementations · 14 Jan 2019 · Yanchi Su, Zhanshan Li, Haihong Yu, Zeyu Wang

Low rank matrix approximation (LRMA) has drawn increasing attention in recent years, due to its wide range of applications in computer vision and machine learning.

Image Denoising

A Data-driven Adversarial Examples Recognition Framework via Adversarial Feature Genome

no code implementations · 25 Dec 2018 · Li Chen, Qi Li, Weiye Chen, Zeyu Wang, Haifeng Li

In this regard, we propose the Adversarial Feature Genome (AFG), a novel type of data that contains both the differences and features about classes.

General Classification, Multi-Label Classification

AniCode: Authoring Coded Artifacts for Network-Free Personalized Animations

1 code implementation · 31 Jul 2018 · Zeyu Wang, Shiyu Qiu, Qingyang Chen, Alexander Ringlein, Julie Dorsey, Holly Rushmeier

We introduce AniCode, a novel framework for authoring and consuming time-based media.

Graphics
