Search Results for author: Yizhou Wang

Found 121 papers, 43 papers with code

Don't Judge by the Look: A Motion Coherent Augmentation for Video Recognition

1 code implementation 14 Mar 2024 Yitian Zhang, Yue Bai, Huan Wang, Yizhou Wang, Yun Fu

Current training pipelines in object recognition neglect Hue Jittering during data augmentation because it not only brings appearance changes that are detrimental to classification but is also inefficient to implement in practice.

Data Augmentation Object Recognition +1

Efficient Action Counting with Dynamic Queries

1 code implementation 3 Mar 2024 Zishi Li, Xiaoxuan Ma, Qiuyan Shang, Wentao Zhu, Hai Ci, Yu Qiao, Yizhou Wang

Temporal repetition counting aims to quantify the repeated action cycles within a video.

Contrastive Learning

Language Models Represent Beliefs of Self and Others

no code implementations 28 Feb 2024 Wentao Zhu, Zhining Zhang, Yizhou Wang

Understanding and attributing mental states, known as Theory of Mind (ToM), emerges as a fundamental capability for human social reasoning.

Causal Inference

Real-time Holistic Robot Pose Estimation with Unknown States

1 code implementation 8 Feb 2024 Shikun Ban, Juling Fan, Wentao Zhu, Xiaoxuan Ma, Yu Qiao, Yizhou Wang

We propose an end-to-end pipeline for real-time, holistic robot pose estimation from a single RGB image, even in the absence of known robot states.

6D Pose Estimation using RGB Robot Pose Estimation

Fast Peer Adaptation with Context-aware Exploration

no code implementations 4 Feb 2024 Long Ma, Yuanfei Wang, Fangwei Zhong, Song-Chun Zhu, Yizhou Wang

To do so, it is crucial for the agent to efficiently probe and identify the peer's strategy, as this is the prerequisite for carrying out the best response in adaptation.

Using LLM such as ChatGPT for Designing and Implementing a RISC Processor: Execution, Challenges and Limitations

1 code implementation 18 Jan 2024 Shadeeb Hossain, Aayush Gohil, Yizhou Wang

This paper discusses the feasibility of using Large Language Models (LLMs) for code generation, with a particular application in designing a RISC processor.

Code Generation

VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding

no code implementations 4 Dec 2023 Yizhou Wang, Ruiyi Zhang, Haoliang Wang, Uttaran Bhattacharya, Yun Fu, Gang Wu

Recent advancements in language-model-based video understanding have been progressing at a remarkable pace, spurred by the introduction of Large Language Models (LLMs).

Language Modelling Question Answering +2

Hulk: A Universal Knowledge Translator for Human-Centric Tasks

1 code implementation 4 Dec 2023 Yizhou Wang, Yixuan Wu, Shixiang Tang, Weizhen He, Xun Guo, Feng Zhu, Lei Bai, Rui Zhao, Jian Wu, Tong He, Wanli Ouyang

Human-centric perception tasks, e.g., pedestrian detection, skeleton-based action recognition, and pose estimation, have wide industrial applications, such as metaverse and sports analysis.

3D Human Pose Estimation Action Recognition +8

Vision meets mmWave Radar: 3D Object Perception Benchmark for Autonomous Driving

no code implementations 17 Nov 2023 Yizhou Wang, Jen-Hao Cheng, Jui-Te Huang, Sheng-Yao Kuan, Qiqian Fu, Chiming Ni, Shengyu Hao, Gaoang Wang, Guanbin Xing, Hui Liu, Jenq-Neng Hwang

This kind of radar format can enable machine learning models to generate more reliable object perception results after interacting and fusing the information or features between the camera and radar.

Autonomous Driving Sensor Fusion

AI Alignment: A Comprehensive Survey

no code implementations 30 Oct 2023 Jiaming Ji, Tianyi Qiu, Boyuan Chen, Borong Zhang, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen Mcaleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao

The former aims to make AI systems aligned via alignment training, while the latter aims to gain evidence about the systems' alignment and govern them appropriately to avoid exacerbating misalignment risks.

ChimpACT: A Longitudinal Dataset for Understanding Chimpanzee Behaviors

1 code implementation NeurIPS 2023 Xiaoxuan Ma, Stephan P. Kaufhold, Jiajun Su, Wentao Zhu, Jack Terwilliger, Andres Meza, Yixin Zhu, Federico Rossano, Yizhou Wang

ChimpACT is both comprehensive and challenging, consisting of 163 videos with a cumulative 160,500 frames, each richly annotated with detection, identification, pose estimation, and fine-grained spatiotemporal behavior labels.

Action Detection Pose Estimation

Safe RLHF: Safe Reinforcement Learning from Human Feedback

1 code implementation 19 Oct 2023 Josef Dai, Xuehai Pan, Ruiyang Sun, Jiaming Ji, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang

However, the inherent tension between the objectives of helpfulness and harmlessness presents a significant challenge during LLM training.

reinforcement-learning Safe Reinforcement Learning

Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World

1 code implementation 16 Oct 2023 Rujie Wu, Xiaojian Ma, Zhenliang Zhang, Wei Wang, Qing Li, Song-Chun Zhu, Yizhou Wang

We even conceived a neuro-symbolic reasoning approach that reconciles LLMs & VLMs with logical reasoning to emulate the human problem-solving process for Bongard Problems.

Few-Shot Learning Logical Reasoning +1

Exploring Counterfactual Alignment Loss towards Human-centered AI

no code implementations 3 Oct 2023 Mingzhou Liu, Xinwei Sun, Ching-Wen Lee, Yu Qiao, Yizhou Wang

In particular, we utilize the counterfactual generation's ability for causal attribution to introduce a novel loss called the CounterFactual Alignment (CF-Align) loss.

Attribute counterfactual +1

Human Motion Generation: A Survey

no code implementations 20 Jul 2023 Wentao Zhu, Xiaoxuan Ma, Dongwoo Ro, Hai Ci, Jinlu Zhang, Jiaxin Shi, Feng Gao, Qi Tian, Yizhou Wang

In this survey, we present a comprehensive literature review of human motion generation, which, to the best of our knowledge, is the first of its kind in this field.

Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions

1 code implementation 13 Jun 2023 Weizhen He, Yiheng Deng, Shixiang Tang, Qihao Chen, Qingsong Xie, Yizhou Wang, Lei Bai, Feng Zhu, Rui Zhao, Wanli Ouyang, Donglian Qi, Yunfeng Yan

This paper strives to resolve this problem by proposing a new instruct-ReID task that requires the model to retrieve images according to the given image or language instructions.

Person Re-Identification

Causal Discovery from Subsampled Time Series with Proxy Variables

1 code implementation NeurIPS 2023 Mingzhou Liu, Xinwei Sun, Lingjing Hu, Yizhou Wang

Based on these, we can leverage the proxies to remove the bias induced by the hidden variables and hence achieve identifiability.

Causal Discovery Causal Identification +1

Causal Discovery with Unobserved Variables: A Proxy Variable Approach

1 code implementation 9 May 2023 Mingzhou Liu, Xinwei Sun, Yu Qiao, Yizhou Wang

Our observation is that discretizing continuous variables can lead to serious errors and compromise the power of the proxy.

Causal Discovery Causal Identification

RSPT: Reconstruct Surroundings and Predict Trajectories for Generalizable Active Object Tracking

no code implementations 7 Apr 2023 Fangwei Zhong, Xiao Bi, Yudi Zhang, Wei Zhang, Yizhou Wang

However, building a generalizable active tracker that works robustly across different scenarios remains a challenge, especially in unstructured environments with cluttered obstacles and diverse layouts.

Autonomous Driving Object Tracking

3D Human Mesh Estimation from Virtual Markers

1 code implementation CVPR 2023 Xiaoxuan Ma, Jiajun Su, Chunyu Wang, Wentao Zhu, Yizhou Wang

Advanced motion capture systems solve the problem by placing dense physical markers on the body surface, which allows realistic meshes to be extracted from their non-rigid motions.

3D Human Pose Estimation 3D Pose Estimation

HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining

1 code implementation CVPR 2023 Shixiang Tang, Cheng Chen, Qingsong Xie, Meilin Chen, Yizhou Wang, Yuanzheng Ci, Lei Bai, Feng Zhu, Haiyang Yang, Li Yi, Rui Zhao, Wanli Ouyang

Specifically, we propose HumanBench, built on existing datasets, to comprehensively evaluate on common ground the generalization abilities of different pretraining methods on 19 datasets from 6 diverse downstream tasks, including person ReID, pose estimation, human parsing, pedestrian attribute recognition, pedestrian detection, and crowd counting.

 Ranked #1 on Pedestrian Attribute Recognition on PA-100K (using extra training data)

Attribute Autonomous Driving +5

Proactive Multi-Camera Collaboration For 3D Human Pose Estimation

no code implementations 7 Mar 2023 Hai Ci, Mickel Liu, Xuehai Pan, Fangwei Zhong, Yizhou Wang

This paper presents a multi-agent reinforcement learning (MARL) scheme for proactive Multi-Camera Collaboration in 3D Human Pose Estimation in dynamic human crowds.

3D Human Pose Estimation 3D Reconstruction +1

UniHCP: A Unified Model for Human-Centric Perceptions

1 code implementation CVPR 2023 Yuanzheng Ci, Yizhou Wang, Meilin Chen, Shixiang Tang, Lei Bai, Feng Zhu, Rui Zhao, Fengwei Yu, Donglian Qi, Wanli Ouyang

When adapted to a specific task, UniHCP achieves new SOTAs on a wide range of human-centric tasks, e.g., 69.8 mIoU on CIHP for human parsing, 86.18 mA on PA-100K for attribute prediction, 90.3 mAP on Market1501 for ReID, and 85.8 JI on CrowdHuman for pedestrian detection, performing better than specialized models tailored for each task.

2D Pose Estimation Attribute +8

Saliency Guided Contrastive Learning on Scene Images

no code implementations 22 Feb 2023 Meilin Chen, Yizhou Wang, Shixiang Tang, Feng Zhu, Haiyang Yang, Lei Bai, Rui Zhao, Donglian Qi, Wanli Ouyang

Despite being feasible, recent works largely overlooked discovering the most discriminative regions for contrastive learning to object representations in scene images.

Contrastive Learning Representation Learning +1

Explainable Anomaly Detection in Images and Videos: A Survey

1 code implementation 13 Feb 2023 Yizhou Wang, Dongliang Guo, Sheng Li, Octavia Camps, Yun Fu

This paper provides the first survey concentrated on explainable visual anomaly detection methods.

Anomaly Detection

Making Reconstruction-based Method Great Again for Video Anomaly Detection

1 code implementation 28 Jan 2023 Yizhou Wang, Can Qin, Yue Bai, Yi Xu, Xu Ma, Yun Fu

With the same perturbation magnitude, the testing reconstruction error of the normal frames lowers more than that of the abnormal frames, which contributes to mitigating the overfitting problem of reconstruction.

Anomaly Detection Optical Flow Estimation +1

GFPose: Learning 3D Human Pose Prior with Gradient Fields

1 code implementation CVPR 2023 Hai Ci, Mingdong Wu, Wentao Zhu, Xiaoxuan Ma, Hao Dong, Fangwei Zhong, Yizhou Wang

During the denoising process, GFPose implicitly incorporates pose priors in gradients and unifies various discriminative and generative tasks in an elegant framework.

Denoising Monocular 3D Human Pose Estimation +1

MotionBERT: A Unified Perspective on Learning Human Motion Representations

1 code implementation ICCV 2023 Wentao Zhu, Xiaoxuan Ma, Zhaoyang Liu, Libin Liu, Wayne Wu, Yizhou Wang

We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.

 Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (using extra training data)

3D Pose Estimation Action Recognition +3

One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement

no code implementations 31 Jul 2022 Zihao Yin, Ping Gong, Chunyu Wang, Yizhou Yu, Yizhou Wang

As an important upstream task for many medical applications, supervised landmark localization still requires non-negligible annotation costs to achieve desirable performance.

Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection

1 code implementation 22 Jul 2022 Hang Ye, Wentao Zhu, Chunyu Wang, Rujie Wu, Yizhou Wang

While the voxel-based methods have achieved promising results for multi-person 3D pose estimation from multi-cameras, they suffer from heavy computation burdens, especially for large scenes.

3D Multi-Person Pose Estimation 3D Pose Estimation

VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data

1 code implementation 20 Jul 2022 Jiajun Su, Chunyu Wang, Xiaoxuan Ma, Wenjun Zeng, Yizhou Wang

While monocular 3D pose estimation seems to have achieved very accurate results on the public datasets, their generalization ability is largely overlooked.

3D Multi-Person Pose Estimation (absolute) 3D Pose Estimation

GaitTAKE: Gait Recognition by Temporal Attention and Keypoint-guided Embedding

no code implementations 7 Jul 2022 Hung-Min Hsu, Yizhou Wang, Cheng-Yen Yang, Jenq-Neng Hwang, Hoang Le Uyen Thuc, Kwang-Ju Kim

Gait recognition, which refers to the recognition or identification of a person based on their body shape and walking styles, derived from video data captured from a distance, is widely used in crime prevention, forensic identification, and social security.

Gait Recognition

Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains

no code implementations 10 May 2022 Haiyang Yang, Meilin Chen, Yizhou Wang, Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Wanli Ouyang

While recent self-supervised learning methods have achieved good performances with evaluation set on the same domain as the training set, they will have an undesirable performance decrease when tested on a different domain.

Self-Supervised Learning

Domain Invariant Model with Graph Convolutional Network for Mammogram Classification

no code implementations 21 Apr 2022 Churan Wang, Jing Li, Xinwei Sun, Fandong Zhang, Yizhou Yu, Yizhou Wang

To resolve this problem, we propose a novel framework, namely Domain Invariant Model with Graph Convolutional Network (DIM-GCN), which only exploits invariant disease-related features from multiple domains.

Classification

Causal Intervention for Subject-Deconfounded Facial Action Unit Recognition

no code implementations 17 Apr 2022 Yingjie Chen, Diqi Chen, Tao Wang, Yizhou Wang, Yun Liang

Subject-invariant facial action unit (AU) recognition remains challenging for the reason that the data distribution varies among subjects.

Causal Inference Facial Action Unit Detection

Adaptive Trajectory Prediction via Transferable GNN

no code implementations CVPR 2022 Yi Xu, Lichen Wang, Yizhou Wang, Yun Fu

To the best of our knowledge, our work is the first to fill the gap in benchmarks and techniques for practical pedestrian trajectory prediction across different domains.

Autonomous Driving Pedestrian Trajectory Prediction +2

GraspARL: Dynamic Grasping via Adversarial Reinforcement Learning

no code implementations 4 Mar 2022 Tianhao Wu, Fangwei Zhong, Yiran Geng, Hongchen Wang, Yongjian Zhu, Yizhou Wang, Hao Dong

We formulate the dynamic grasping problem as a 'move-and-grasp' game, where the robot is to pick up the object on the mover and the adversarial mover is to find a path to escape it.

Object reinforcement-learning +1

MoCaNet: Motion Retargeting in-the-wild via Canonicalization Networks

no code implementations 19 Dec 2021 Wentao Zhu, Zhuoqian Yang, Ziang Di, Wayne Wu, Yizhou Wang, Chen Change Loy

Trained with the canonicalization operations and the derived regularizations, our method learns to factorize a skeleton sequence into three independent semantic subspaces, i.e., motion, structure, and view angle.

3D Reconstruction Action Analysis +2

SLA$^2$P: Self-supervised Anomaly Detection with Adversarial Perturbation

1 code implementation 25 Nov 2021 Yizhou Wang, Can Qin, Rongzhe Wei, Yi Xu, Yue Bai, Yun Fu

Next we add adversarial perturbation to the transformed features to decrease their softmax scores of the predicted labels and design anomaly scores based on the predictive uncertainties of the classifier on these perturbed features.

Pseudo Label Self-Supervised Anomaly Detection +3

ToM2C: Target-oriented Multi-agent Communication and Cooperation with Theory of Mind

1 code implementation NeurIPS 2021 Yuanfei Wang, Fangwei Zhong, Jing Xu, Yizhou Wang

With ToM, each agent is capable of inferring the mental states and intentions of others according to its (local) observation.

Symmetry-Enhanced Attention Network for Acute Ischemic Infarct Segmentation with Non-Contrast CT Images

1 code implementation 11 Oct 2021 Kongming Liang, Kai Han, Xiuli Li, Xiaoqing Cheng, Yiming Li, Yizhou Wang, Yizhou Yu

In this paper, we propose a symmetry enhanced attention network (SEAN) for acute ischemic infarct segmentation.

Context-LGM: Leveraging Object-Context Relation for Context-Aware Object Recognition

no code implementations 8 Oct 2021 Mingzhou Liu, Xinwei Sun, Fandong Zhang, Yizhou Yu, Yizhou Wang

Finally, to implement this contextual posterior, we introduce a Transformer that takes the object's information as a reference and locates correlated contextual factors.

Emotion Recognition Object +2

MemREIN: Rein the Domain Shift for Cross-Domain Few-Shot Learning

no code implementations 29 Sep 2021 Yi Xu, Lichen Wang, Yizhou Wang, Can Qin, Yulun Zhang, Yun Fu

In this paper, we propose a novel framework, MemREIN, which considers Memorized, Restitution, and Instance Normalization for cross-domain few-shot learning.

Contrastive Learning cross-domain few-shot learning

ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot

no code implementations ICCV 2021 Jiarui Cai, Yizhou Wang, Jenq-Neng Hwang

One-stage long-tailed recognition methods improve the overall performance in a "seesaw" manner, i.e., either sacrifice the head's accuracy for better tail classification or elevate the head's accuracy even higher but ignore the tail.

Long-tail Learning

Which Invariance Should We Transfer? A Causal Minimax Learning Approach

1 code implementation 5 Jul 2021 Mingzhou Liu, Xiangyu Zheng, Xinwei Sun, Fang Fang, Yizhou Wang

When this condition fails, we surprisingly find with an example that this whole stable set, although it can fully exploit stable information, is not the optimal one to transfer.

Domain Generalization

Rethinking Adam: A Twofold Exponential Moving Average Approach

1 code implementation 22 Jun 2021 Yizhou Wang, Yue Kang, Can Qin, Huan Wang, Yi Xu, Yulun Zhang, Yun Fu

The intuition is that gradient with momentum contains more accurate directional information and therefore its second moment estimation is a more favorable option for learning rate scaling than that of the raw gradient.

Towards Distraction-Robust Active Visual Tracking

no code implementations 18 Jun 2021 Fangwei Zhong, Peng Sun, Wenhan Luo, Tingyun Yan, Yizhou Wang

In active visual tracking, it is notoriously difficult when distracting objects appear, as distractors often mislead the tracker by occluding the target or bringing a confusing appearance.

Visual Tracking

Towards Unified Surgical Skill Assessment

no code implementations CVPR 2021 Daochang Liu, Qiyue Li, Tingting Jiang, Yizhou Wang, Rulin Miao, Fei Shan, Ziyu Li

In this paper, a unified multi-path framework for automatic surgical skill assessment is proposed, which takes care of multiple composing aspects of surgical skills, including surgical tool usage, intraoperative event pattern, and other skill proxies.

Act Like a Radiologist: Towards Reliable Multi-view Correspondence Reasoning for Mammogram Mass Detection

no code implementations 21 May 2021 Yuhang Liu, Fandong Zhang, Chaoqi Chen, Siwen Wang, Yizhou Wang, Yizhou Yu

In this paper, we propose an Anatomy-aware Graph convolutional Network (AGN), which is tailored for mammogram mass detection and endows existing detection methods with multi-view reasoning ability.

Anatomy Decision Making +2

Split and Connect: A Universal Tracklet Booster for Multi-Object Tracking

no code implementations 6 May 2021 Gaoang Wang, Yizhou Wang, Renshu Gu, Weijie Hu, Jenq-Neng Hwang

To address such common challenges in most of the existing trackers, in this paper, a tracklet booster algorithm is proposed, which can be built upon any other tracker.

Multi-Object Tracking

Multi-Target Multi-Camera Tracking of Vehicles using Metadata-Aided Re-ID and Trajectory-Based Camera Link Model

no code implementations 3 May 2021 Hung-Min Hsu, Jiarui Cai, Yizhou Wang, Jenq-Neng Hwang, Kwang-Ju Kim

In this paper, we propose a novel framework for multi-target multi-camera tracking (MTMCT) of vehicles based on metadata-aided re-identification (MA-ReID) and the trajectory-based camera link model (TCLM).

Clustering

Causal Hidden Markov Model for Time Series Disease Forecasting

no code implementations CVPR 2021 Jing Li, Botong Wu, Xinwei Sun, Yizhou Wang

We propose a causal hidden Markov model to achieve robust prediction of irreversible disease at an early stage, which is safety-critical and vital for medical treatment in early stages.

Time Series Time Series Analysis

Context Modeling in 3D Human Pose Estimation: A Unified Perspective

1 code implementation CVPR 2021 Xiaoxuan Ma, Jiajun Su, Chunyu Wang, Hai Ci, Yizhou Wang

By comparing the two methods, we found that the end-to-end training scheme in GNN and the limb length constraints in PSM are two complementary factors to improve results.

3D Human Pose Estimation

Forecasting Irreversible Disease via Progression Learning

no code implementations CVPR 2021 Botong Wu, Sijie Ren, Jing Li, Xinwei Sun, Shiming Li, Yizhou Wang

In order to account for the degree of progression of the disease, we propose a temporal generative model to accurately generate the future image and compare it with the current one to get a residual image.

Disease Prediction

Revisiting 3D Context Modeling with Supervised Pre-training for Universal Lesion Detection in CT Slices

1 code implementation 16 Dec 2020 Shu Zhang, Jincheng Xu, Yu-Chun Chen, Jiechao Ma, Zihao Li, Yizhou Wang, Yizhou Yu

We demonstrate that with the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset (3.48% absolute improvement in the sensitivity of FPs@0.5), significantly surpassing the baseline method by up to 6.06% (in MAP@0.5) which adopts 2D convolution for 3D context modeling.

Computed Tomography (CT) Lesion Detection +2

An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation

1 code implementation ICCV 2021 Rongchang Xie, Chunyu Wang, Wenjun Zeng, Yizhou Wang

The state-of-the-art methods are consistency-based which learn about unlabeled images by encouraging the model to give consistent predictions for images under different augmentations.

Pose Estimation Semi-Supervised Human Pose Estimation

Learning Multi-Agent Coordination for Enhancing Target Coverage in Directional Sensor Networks

1 code implementation NeurIPS 2020 Jing Xu, Fangwei Zhong, Yizhou Wang

Maximum target coverage by adjusting the orientation of distributed sensors is an important problem in directional sensor networks (DSNs).

Bilateral Asymmetry Guided Counterfactual Generating Network for Mammogram Classification

no code implementations 30 Sep 2020 Chu-ran Wang, Jing Li, Fandong Zhang, Xinwei Sun, Hao Dong, Yizhou Yu, Yizhou Wang

Mammogram benign or malignant classification with only image-level labels is challenging due to the absence of lesion annotations.

Classification counterfactual +1

Surgical Skill Assessment on In-Vivo Clinical Data via the Clearness of Operating Field

no code implementations 27 Aug 2020 Daochang Liu, Tingting Jiang, Yizhou Wang, Rulin Miao, Fei Shan, Ziyu Li

Then an objective and automated framework based on a neural network is proposed to predict surgical skills through the proxy of COF.

Unsupervised Surgical Instrument Segmentation via Anchor Generation and Semantic Diffusion

1 code implementation 27 Aug 2020 Daochang Liu, Yuhui Wei, Tingting Jiang, Yizhou Wang, Rulin Miao, Fei Shan, Ziyu Li

In the experiments on the binary instrument segmentation task of the 2017 MICCAI EndoVis Robotic Instrument Segmentation Challenge dataset, the proposed method achieves 0.71 IoU and 0.81 Dice score without using a single manual annotation, which is promising to show the potential of unsupervised learning for surgical tool segmentation.

Feature Correlation Segmentation

Traffic-Aware Multi-Camera Tracking of Vehicles Based on ReID and Camera Link Model

no code implementations 22 Aug 2020 Hung-Min Hsu, Yizhou Wang, Jenq-Neng Hwang

In this paper, we propose an effective and reliable MTMCT framework for vehicles, which consists of a traffic-aware single camera tracking (TSCT) algorithm, a trajectory-based camera link model (CLM) for vehicle re-identification (ReID), and a hierarchical clustering algorithm to obtain the cross camera vehicle trajectories.

Clustering Vehicle Re-Identification

Leveraging both Lesion Features and Procedural Bias in Neuroimaging: An Dual-Task Split dynamics of inverse scale space

no code implementations 17 Jul 2020 Xinwei Sun, Wenjing Han, Lingjing Hu, Yuan YAO, Yizhou Wang

Specifically, with a variable splitting term, two estimators are introduced and split apart, i.e., one is for feature selection (the sparse estimator) and the other is for prediction (the dense estimator).

feature selection

Augmented Bi-path Network for Few-shot Learning

no code implementations 15 Jul 2020 Baoming Yan, Chen Zhou, Bo Zhao, Kan Guo, Jiang Yang, Xiaobo Li, Ming Zhang, Yizhou Wang

Finally, the model learns to compare global and local features separately, i.e., in two paths, before merging the similarities.

Few-Shot Learning

TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning

no code implementations ECCV 2020 Xinwei Sun, Yilun Xu, Peng Cao, Yuqing Kong, Lingjing Hu, Shanghang Zhang, Yizhou Wang

In this paper, we propose a novel information-theoretic approach, namely Total Correlation Gain Maximization (TCGM), for semi-supervised multi-modal learning, which is endowed with promising properties: (i) it can effectively utilize the information across different modalities of unlabeled data points to facilitate training classifiers of each modality; (ii) it has a theoretical guarantee to identify Bayesian classifiers, i.e., the ground truth posteriors of all modalities.

Disease Prediction Emotion Recognition +1

Context-Aware Refinement Network Incorporating Structural Connectivity Prior for Brain Midline Delineation

1 code implementation 10 Jul 2020 Shen Wang, Kongming Liang, Yiming Li, Yizhou Yu, Yizhou Wang

Nevertheless, there are still great challenges with brain midline delineation, such as the largely deformed midline caused by the mass effect and the possible morphological failure that the predicted midline is not a connected curve.

IA-MOT: Instance-Aware Multi-Object Tracking with Motion Consistency

no code implementations 24 Jun 2020 Jiarui Cai, Yizhou Wang, Haotian Zhang, Hung-Min Hsu, Chengqian Ma, Jenq-Neng Hwang

Meanwhile, the spatial attention, which focuses on the foreground within the bounding boxes, is generated from the given instance masks and applied to the extracted embedding features.

Multi-Object Tracking Multiple Object Tracking +1

RODNet: Radar Object Detection Using Cross-Modal Supervision

1 code implementation 3 Mar 2020 Yizhou Wang, Zhongyu Jiang, Xiangyu Gao, Jenq-Neng Hwang, Guanbin Xing, Hui Liu

Radar is usually more robust than the camera in severe driving scenarios, e.g., weak/strong lighting and bad weather.

Autonomous Driving Object +3

Segmentation-based Method combined with Dynamic Programming for Brain Midline Delineation

no code implementations 27 Feb 2020 Shen Wang, Kongming Liang, Chengwei Pan, Chuyang Ye, Xiuli Li, Feng Liu, Yizhou Yu, Yizhou Wang

The midline related pathological image features are crucial for evaluating the severity of brain compression caused by stroke or traumatic brain injury (TBI).

Decision Making

Self-Directed Online Machine Learning for Topology Optimization

1 code implementation 4 Feb 2020 Changyu Deng, Yizhou Wang, Can Qin, Yun Fu, Wei Lu

A small amount of training data is generated dynamically based on the DNN's prediction of the optimum.

BIG-bench Machine Learning Stochastic Optimization

Pose-Assisted Multi-Camera Collaboration for Active Object Tracking

no code implementations 15 Jan 2020 Jing Li, Jing Xu, Fangwei Zhong, Xiangyu Kong, Yu Qiao, Yizhou Wang

In the system, each camera is equipped with two controllers and a switcher: The vision-based controller tracks targets based on observed images.

Object Object Tracking

L_DMI: A Novel Information-theoretic Loss Function for Training Deep Nets Robust to Label Noise

no code implementations NeurIPS 2019 Yilun Xu, Peng Cao, Yuqing Kong, Yizhou Wang

To the best of our knowledge, L_DMI is the first loss function that is provably robust to instance-independent label noise, regardless of noise pattern, and it can be applied to any existing classification neural networks straightforwardly without any auxiliary information.

Ranked #35 on Image Classification on Clothing1M (using extra training data)

Learning with noisy labels

L_DMI: An Information-theoretic Noise-robust Loss Function

2 code implementations 8 Sep 2019 Yilun Xu, Peng Cao, Yuqing Kong, Yizhou Wang

To the best of our knowledge, L_DMI is the first loss function that is provably robust to instance-independent label noise, regardless of noise pattern, and it can be applied to any existing classification neural networks straightforwardly without any auxiliary information.

Learning with noisy labels

Multi-level Domain Adaptive learning for Cross-Domain Detection

no code implementations 26 Jul 2019 Rongchang Xie, Fei Yu, Jiachao Wang, Yizhou Wang, Li Zhang

In recent years, object detection has shown impressive results using supervised deep learning, but it remains challenging in a cross-domain environment.

Object object-detection +1

Max-MIG: an Information Theoretic Approach for Joint Learning from Crowds

1 code implementation ICLR 2019 Peng Cao, Yilun Xu, Yuqing Kong, Yizhou Wang

Furthermore, we devise an accurate data-crowds forecaster that employs both the data and the crowdsourced labels to forecast the ground truth.

AD-VAT: An Asymmetric Dueling mechanism for learning Visual Active Tracking

no code implementations ICLR 2019 Fangwei Zhong, Peng Sun, Wenhan Luo, Tingyun Yan, Yizhou Wang

In AD-VAT, both the tracker and the target are approximated by end-to-end neural networks and are trained via RL in a dueling/competitive manner, i.e., the tracker intends to lock up the target, while the target tries to escape from the tracker.

SSoC: Learning Spontaneous and Self-Organizing Communication for Multi-Agent Collaboration

no code implementations ICLR 2019 Xiangyu Kong, Jing Li, Bo Xin, Yizhou Wang

By treating the communication behaviour as an explicit action, SSoC learns to organize communication in an effective and efficient way.

Multi-agent Reinforcement Learning

$S^{2}$-LBI: Stochastic Split Linearized Bregman Iterations for Parsimonious Deep Learning

no code implementations 24 Apr 2019 Yanwei Fu, Donghao Li, Xinwei Sun, Shun Zhang, Yizhou Wang, Yuan YAO

This paper proposes a novel Stochastic Split Linearized Bregman Iteration ($S^{2}$-LBI) algorithm to efficiently train the deep network.

Computational Efficiency Model Selection

Multi-Agent Tensor Fusion for Contextual Trajectory Prediction

1 code implementation CVPR 2019 Tianyang Zhao, Yifei Xu, Mathew Monfort, Wongun Choi, Chris Baker, Yibiao Zhao, Yizhou Wang, Ying Nian Wu

Specifically, the model encodes multiple agents' past trajectories and the scene context into a Multi-Agent Tensor, then applies convolutional fusion to capture multiagent interactions while retaining the spatial structure of agents and the scene context.

Autonomous Driving Trajectory Prediction

Efficient Model-Free Reinforcement Learning Using Gaussian Process

no code implementations 11 Dec 2018 Ying Fan, Letian Chen, Yizhou Wang

Efficient reinforcement learning usually takes advantage of demonstrations or a good exploration strategy.

reinforcement-learning Reinforcement Learning (RL)

CRAVES: Controlling Robotic Arm with a Vision-based Economic System

1 code implementation CVPR 2019 Yiming Zuo, Weichao Qiu, Lingxi Xie, Fangwei Zhong, Yizhou Wang, Alan L. Yuille

We also construct a vision-based control system for task accomplishment, for which we train a reinforcement learning agent in a virtual environment and apply it to the real world.

3D Pose Estimation Domain Adaptation

Exploit the Connectivity: Multi-Object Tracking with TrackletNet

1 code implementation 18 Nov 2018 Gaoang Wang, Yizhou Wang, Haotian Zhang, Renshu Gu, Jenq-Neng Hwang

Multi-object tracking (MOT) is an important and practical task related to both surveillance systems and moving camera applications, such as autonomous driving and robotic vision.

Autonomous Driving Multi-Object Tracking +1

HAPPIER: Hierarchical Polyphonic Music Generative RNN

no code implementations 27 Sep 2018 Tianyang Zhao, Xiaoxuan Ma, Honglin Ma, Yizhou Wang

Generating polyphonic music with coherent global structure is a major challenge for automatic composition algorithms.

Video Object Segmentation by Learning Location-Sensitive Embeddings

no code implementations ECCV 2018 Hai Ci, Chunyu Wang, Yizhou Wang

We address the problem of video object segmentation, which outputs the masks of a target object throughout a video given only a bounding box in the first frame.

Object Semantic Segmentation +2

End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning

no code implementations 10 Aug 2018 Wenhan Luo, Peng Sun, Fangwei Zhong, Wei Liu, Tong Zhang, Yizhou Wang

We further propose an environment augmentation technique and a customized reward function, which are crucial for successful training.

Object Object Tracking +1

Integrating Feature and Image Pyramid: A Lung Nodule Detector Learned in Curriculum Fashion

no code implementations 21 Jul 2018 Benyuan Sun, Zhen Zhou, Fandong Zhang, Xiuli Li, Yizhou Wang

Meanwhile, our sampling strategy halves the training time of the proposal network on LUNA16.

MSplit LBI: Realizing Feature Selection and Dense Estimation Simultaneously in Few-shot and Zero-shot Learning

no code implementations ICML 2018 Bo Zhao, Xinwei Sun, Yanwei Fu, Yuan YAO, Yizhou Wang

To solve this task, $L_{1}$ regularization is widely used for the pursuit of feature selection and avoiding overfitting, and yet the sparse estimation of features in $L_{1}$ regularization may cause the underfitting of training data.

feature selection Zero-Shot Learning

A Large-scale Attribute Dataset for Zero-shot Learning

1 code implementation 12 Apr 2018 Bo Zhao, Yanwei Fu, Rui Liang, Jia-Hong Wu, Yonggang Wang, Yizhou Wang

In classical ZSL algorithms, attributes are introduced as the intermediate semantic representation to realize the knowledge transfer from seen classes to unseen classes.

Attribute Transfer Learning +1

Joint Learning for Pulmonary Nodule Segmentation, Attributes and Malignancy Prediction

no code implementations 10 Feb 2018 Botong Wu, Zhen Zhou, Jianwei Wang, Yizhou Wang

In the literature on lung nodule classification, many studies adopt Convolutional Neural Networks (CNN) to directly predict the malignancy of lung nodules from the original thoracic Computed Tomography (CT) scan and the nodule location.

Attribute Computed Tomography (CT) +4

RAN4IQA: Restorative Adversarial Nets for No-Reference Image Quality Assessment

no code implementations 14 Dec 2017 Hongyu Ren, Diqi Chen, Yizhou Wang

The evaluator predicts a perceptual score by extracting feature representations from the distorted and restored patches to measure GoR.

No-Reference Image Quality Assessment NR-IQA +1

Zero-shot Learning via Shared-Reconstruction-Graph Pursuit

no code implementations 20 Nov 2017 Bo Zhao, Xinwei Sun, Yuan YAO, Yizhou Wang

With the learned SRG, each unseen class prototype (cluster center) in the image feature space can be synthesized by the linear combination of other class prototypes, so that testing instances can be classified based on the distance to these synthesized prototypes.

Clustering Generalized Zero-Shot Learning +1

End-to-end Active Object Tracking via Reinforcement Learning

no code implementations ICML 2018 Wenhan Luo, Peng Sun, Fangwei Zhong, Wei Liu, Tong Zhang, Yizhou Wang

We study active object tracking, where a tracker takes as input the visual observation (i.e., frame sequence) and produces the camera control signal (e.g., move forward, turn left, etc.).

Object Object Tracking +2

Collaborative Deep Reinforcement Learning for Joint Object Search

no code implementations CVPR 2017 Xiangyu Kong, Bo Xin, Yizhou Wang, Gang Hua

We examine the problem of joint top-down active search of multiple objects under interaction, e.g., a person riding a bicycle, cups held by a table, etc.

Active Object Localization Object +5

An Attention-Driven Approach of No-Reference Image Quality Assessment

no code implementations 12 Dec 2016 Diqi Chen, Yizhou Wang, Tianfu Wu, Wen Gao

The model learning is implemented by a reinforcement strategy, in which the rewards of both tasks guide the learning of the optimal sampling policy to acquire the "task-informative" image regions so that the predictions can be made accurately and efficiently (in terms of the sampling steps).

Multi-Task Learning No-Reference Image Quality Assessment +2

Zero-Shot Learning posed as a Missing Data Problem

no code implementations 2 Dec 2016 Bo Zhao, Botong Wu, Tianfu Wu, Yizhou Wang

This paper presents a method for zero-shot learning (ZSL) that poses ZSL as a missing data problem rather than a missing label problem.

Zero-Shot Learning

Face Detection with End-to-End Integration of a ConvNet and a 3D Model

4 code implementations 2 Jun 2016 Yunzhu Li, Benyuan Sun, Tianfu Wu, Yizhou Wang

The proposed method addresses two issues in adapting state-of-the-art generic object detection ConvNets (e.g., faster R-CNN) for face detection: (i) One is to eliminate the heuristic design of predefined anchor boxes in the region proposals network (RPN) by exploiting a 3D mean face model.

Face Detection Face Model +3

Maximal Sparsity with Deep Networks?

no code implementations NeurIPS 2016 Bo Xin, Yizhou Wang, Wen Gao, David Wipf

The iterations of many sparse estimation algorithms are comprised of a fixed linear filter cascaded with a thresholding nonlinearity, which collectively resemble a typical neural network layer.

A Novel Method to Study Bottom-up Visual Saliency and its Neural Mechanism

no code implementations 13 Apr 2016 Cheng Chen, Xilin Zhang, Yizhou Wang, Fang Fang

In this study, we propose a novel method to measure bottom-up saliency maps of natural images.

Exploiting Object Similarity in 3D Reconstruction

no code implementations ICCV 2015 Chen Zhou, Fatma Guney, Yizhou Wang, Andreas Geiger

Despite recent progress, reconstructing outdoor scenes in 3D from movable platforms remains a highly difficult endeavour.

3D Reconstruction Object

Stable Feature Selection from Brain sMRI

no code implementations 25 Mar 2015 Bo Xin, Lingjing Hu, Yizhou Wang, Wen Gao

Neuroimage analysis usually involves learning thousands or even millions of variables using only a limited number of samples.

feature selection

Robust Subjective Visual Property Prediction from Crowdsourced Pairwise Labels

no code implementations 25 Jan 2015 Yanwei Fu, Timothy M. Hospedales, Tao Xiang, Jiechao Xiong, Shaogang Gong, Yizhou Wang, Yuan YAO

In this paper, we propose a more principled way to identify annotation outliers by formulating the subjective visual property prediction task as a unified robust learning-to-rank problem, tackling outlier detection and learning to rank jointly.

Attribute Learning-To-Rank +2

Representing Data by a Mixture of Activated Simplices

no code implementations 12 Dec 2014 Chunyu Wang, John Flynn, Yizhou Wang, Alan L. Yuille

We show that under this restriction, building a model with simplices amounts to constructing a convex hull inside the sphere whose boundary facets are close to the data.

Predictive Encoding of Contextual Relationships for Perceptual Inference, Interpolation and Prediction

no code implementations 14 Nov 2014 Ming-Min Zhao, Chengxu Zhuang, Yizhou Wang, Tai Sing Lee

We propose a new neurally inspired model that can learn to encode the global relationship context of visual events across time and space and to use the contextual information to modulate the analysis-by-synthesis process in a predictive coding framework.

Robust Estimation of 3D Human Poses from a Single Image

no code implementations CVPR 2014 Chunyu Wang, Yizhou Wang, Zhouchen Lin, Alan L. Yuille, Wen Gao

We address the challenges in three ways: (i) We represent a 3D pose as a linear combination of a sparse set of bases learned from 3D human skeletons.

3D Human Pose Estimation 3D Pose Estimation +2

Weakly Supervised Learning for Attribute Localization in Outdoor Scenes

no code implementations CVPR 2013 Shuo Wang, Jungseock Joo, Yizhou Wang, Song-Chun Zhu

We evaluate the proposed method by (i) showing the improvement of attribute recognition accuracy; and (ii) comparing the average precision of localizing attributes to the scene parts.

Attribute Weakly-supervised Learning

What Object Motion Reveals about Shape with Unknown BRDF and Lighting

no code implementations CVPR 2013 Manmohan Chandraker, Dikpal Reddy, Yizhou Wang, Ravi Ramamoorthi

Under orthographic projection, we prove that three differential motions suffice to yield an invariant that relates shape to image derivatives, regardless of BRDF and illumination.

Surface Reconstruction
