Search Results for author: Xu Zhao

Found 62 papers, 25 papers with code

Knowledge-Enabled Diagnosis Assistant Based on Obstetric EMRs and Knowledge Graph

no code implementations CCL 2020 Kunli Zhang, Xu Zhao, Lei Zhuang, Qi Xie, Hongying Zan

In this paper, we treat the diagnosis assistant as a multi-label classification task and propose a Knowledge-Enabled Diagnosis Assistant (KEDA) model for the obstetric diagnosis assistant.

Disease Prediction Multi-Label Classification +1

Learning Multi-scale Spatial-frequency Features for Image Denoising

no code implementations 19 Jun 2025 Xu Zhao, Chen Zhao, Xiantao Hu, Hongliang Zhang, Ying Tai, Jian Yang

Recent advancements in multi-scale architectures have demonstrated exceptional performance in image denoising tasks.

Image Denoising

Trajectory Entropy: Modeling Game State Stability from Multimodality Trajectory Prediction

no code implementations 6 Jun 2025 Yesheng Zhang, Wenjian Sun, YuHeng Chen, Qingwei Liu, Qi Lin, Rui Zhang, Xu Zhao

To tackle the issue, this paper proposes a metric, termed Trajectory Entropy, to reveal the game status of agents within the level-k game framework.

Autonomous Driving Trajectory Prediction

DGOcc: Depth-aware Global Query-based Network for Monocular 3D Occupancy Prediction

no code implementations 10 Apr 2025 Xu Zhao, Pengju Zhang, Bo Liu, Yihong Wu

Monocular 3D occupancy prediction, aiming to predict the occupancy and semantics within interesting regions of 3D scenes from only 2D images, has garnered increasing attention recently for its vital role in 3D scene understanding.

GPU Prediction +1

Bayesian Optimization for Controlled Image Editing via LLMs

no code implementations 25 Feb 2025 Chengkun Cai, Haoliang Liu, Xu Zhao, Zhongyu Jiang, Tianfang Zhang, Zongkai Wu, Jenq-Neng Hwang, Serge Belongie, Lei LI

In the rapidly evolving field of image generation, achieving precise control over generated content and maintaining semantic consistency remain significant limitations, particularly concerning grounding techniques and the necessity for model fine-tuning.

Bayesian Optimization Image Generation

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

1 code implementation 17 Feb 2025 Ailin Huang, Boyong Wu, Bruce Wang, Chao Yan, Chen Hu, Chengli Feng, Fei Tian, Feiyu Shen, Jingbei Li, Mingrui Chen, Peng Liu, Ruihang Miao, Wang You, Xi Chen, Xuerui Yang, Yechang Huang, Yuxiang Zhang, Zheng Gong, Zixin Zhang, HongYu Zhou, Jianjian Sun, Brian Li, Chengting Feng, Changyi Wan, Hanpeng Hu, Jianchang Wu, Jiangjie Zhen, Ranchen Ming, Song Yuan, Xuelin Zhang, Yu Zhou, Bingxin Li, Buyun Ma, Hongyuan Wang, Kang An, Wei Ji, Wen Li, Xuan Wen, Xiangwen Kong, Yuankai Ma, Yuanwei Liang, Yun Mou, Bahtiyar Ahmidi, Bin Wang, Bo Li, Changxin Miao, Chen Xu, Chenrun Wang, Dapeng Shi, Deshan Sun, Dingyuan Hu, Dula Sai, Enle Liu, Guanzhe Huang, Gulin Yan, Heng Wang, Haonan Jia, Haoyang Zhang, Jiahao Gong, Junjing Guo, Jiashuai Liu, Jiahong Liu, Jie Feng, Jie Wu, Jiaoren Wu, Jie Yang, Jinguo Wang, Jingyang Zhang, Junzhe Lin, Kaixiang Li, Lei Xia, Li Zhou, Liang Zhao, Longlong Gu, Mei Chen, Menglin Wu, Ming Li, Mingxiao Li, Mingliang Li, Mingyao Liang, Na Wang, Nie Hao, Qiling Wu, Qinyuan Tan, Ran Sun, Shuai Shuai, Shaoliang Pang, Shiliang Yang, Shuli Gao, Shanshan Yuan, SiQi Liu, Shihong Deng, Shilei Jiang, Sitong Liu, Tiancheng Cao, Tianyu Wang, Wenjin Deng, Wuxun Xie, Weipeng Ming, Wenqing He, Wen Sun, Xin Han, Xin Huang, Xiaomin Deng, Xiaojia Liu, Xin Wu, Xu Zhao, Yanan Wei, Yanbo Yu, Yang Cao, Yangguang Li, Yangzhen Ma, Yanming Xu, Yaoyu Wang, Yaqiang Shi, Yilei Wang, Yizhuang Zhou, Yinmin Zhong, Yang Zhang, Yaoben Wei, Yu Luo, Yuanwei Lu, Yuhe Yin, Yuchu Luo, Yuanhao Ding, Yuting Yan, Yaqi Dai, Yuxiang Yang, Zhe Xie, Zheng Ge, Zheng Sun, Zhewei Huang, Zhichao Chang, Zhisheng Guan, Zidong Yang, Zili Zhang, Binxing Jiao, Daxin Jiang, Heung-Yeung Shum, Jiansheng Chen, Jing Li, Shuchang Zhou, Xiangyu Zhang, Xinhao Zhang, Yibo Zhu

Based on our new StepEval-Audio-360 evaluation benchmark, Step-Audio achieves state-of-the-art performance in human evaluations, especially in terms of instruction following.

Instruction Following Voice Cloning

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

3 code implementations 14 Feb 2025 Guoqing Ma, Haoyang Huang, Kun Yan, Liangyu Chen, Nan Duan, Shengming Yin, Changyi Wan, Ranchen Ming, Xiaoniu Song, Xing Chen, Yu Zhou, Deshan Sun, Deyu Zhou, Jian Zhou, Kaijun Tan, Kang An, Mei Chen, Wei Ji, Qiling Wu, Wen Sun, Xin Han, Yanan Wei, Zheng Ge, Aojie Li, Bin Wang, Bizhu Huang, Bo wang, Brian Li, Changxing Miao, Chen Xu, Chenfei Wu, Chenguang Yu, Dapeng Shi, Dingyuan Hu, Enle Liu, Gang Yu, Ge Yang, Guanzhe Huang, Gulin Yan, Haiyang Feng, Hao Nie, Haonan Jia, Hanpeng Hu, Hanqi Chen, Haolong Yan, Heng Wang, Hongcheng Guo, Huilin Xiong, Huixin Xiong, Jiahao Gong, Jianchang Wu, Jiaoren Wu, Jie Wu, Jie Yang, Jiashuai Liu, Jiashuo Li, Jingyang Zhang, Junjing Guo, Junzhe Lin, Kaixiang Li, Lei Liu, Lei Xia, Liang Zhao, Liguo Tan, Liwen Huang, Liying Shi, Ming Li, Mingliang Li, Muhua Cheng, Na Wang, Qiaohui Chen, Qinglin He, Qiuyan Liang, Quan Sun, Ran Sun, Rui Wang, Shaoliang Pang, Shiliang Yang, Sitong Liu, SiQi Liu, Shuli Gao, Tiancheng Cao, Tianyu Wang, Weipeng Ming, Wenqing He, Xu Zhao, Xuelin Zhang, Xianfang Zeng, Xiaojia Liu, Xuan Yang, Yaqi Dai, Yanbo Yu, Yang Li, Yineng Deng, Yingming Wang, Yilei Wang, Yuanwei Lu, Yu Chen, Yu Luo, Yuchu Luo, Yuhe Yin, Yuheng Feng, Yuxiang Yang, Zecheng Tang, Zekai Zhang, Zidong Yang, Binxing Jiao, Jiansheng Chen, Jing Li, Shuchang Zhou, Xiangyu Zhang, Xinhao Zhang, Yibo Zhu, Heung-Yeung Shum, Daxin Jiang

We present Step-Video-T2V, a state-of-the-art text-to-video pre-trained model with 30B parameters and the ability to generate videos up to 204 frames in length.

Video Generation Video Reconstruction

Systematic Outliers in Large Language Models

1 code implementation 10 Feb 2025 Yongqi An, Xu Zhao, Tao Yu, Ming Tang, Jinqiao Wang

Outliers have been widely observed in Large Language Models (LLMs), significantly impacting model performance and posing challenges for model compression.

Model Compression

Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking

1 code implementation 20 Dec 2024 Xiantao Hu, Ying Tai, Xu Zhao, Chen Zhao, Zhenyu Zhang, Jun Li, Bineng Zhong, Jian Yang

These temporal information tokens are used to guide the localization of the target in the next time state, establish long-range contextual relationships between video frames, and capture the temporal trajectory of the target.

Mamba Rgb-T Tracking +1

Transfer Learning in Vocal Education: Technical Evaluation of Limited Samples Describing Mezzo-soprano

no code implementations 30 Oct 2024 Zhenyi Hou, Xu Zhao, Kejie Ye, Xinyu Sheng, Shanggerile Jiang, Jiajing Xia, YiTao Zhang, Chenxi Ban, Daijun Luo, Jiaxing Chen, Yan Zou, Yuchao Feng, Guangyu Fan, Xin Yuan

Vocal education in the music field is difficult to quantify due to the individual differences in singers' voices and the different quantitative criteria of singing techniques.

Deep Learning Transfer Learning

Disambiguating Monocular Reconstruction of 3D Clothed Human with Spatial-Temporal Transformer

no code implementations 21 Oct 2024 Yong Deng, Baoxing Li, Xu Zhao

Simultaneously, to compensate for local ambiguity in images, a temporal transformer is utilized to extract temporal features from adjacent frames.

Monocular Reconstruction

The Role of Deductive and Inductive Reasoning in Large Language Models

no code implementations 3 Oct 2024 Chengkun Cai, Xu Zhao, Haoliang Liu, Zhongyu Jiang, Tianfang Zhang, Zongkai Wu, Jenq-Neng Hwang, Lei LI

Large Language Models (LLMs) have achieved substantial progress in artificial intelligence, particularly in reasoning tasks.

GSM8K

MESA: Effective Matching Redundancy Reduction by Semantic Area Segmentation

1 code implementation 1 Aug 2024 Yesheng Zhang, Shuhan Shen, Xu Zhao

To address the efficiency issue of MESA, we further propose DMESA as its dense counterpart, applying a dense matching framework.

Patch Matching

Matlab-based Epoch Extraction for Speaker Differentiation

no code implementations 26 Jul 2024 Kunlun Li, Daniel Ferro, Xu Zhao, Abdul Jabbar Syed, Anil K Vuppala, Azeemuddin Syed

The number of epochs occurring at similar positions to the reference speaker will be counted as Delta, with larger Delta values indicating greater speaker similarity.

$T^2$ of Thoughts: Temperature Tree Elicits Reasoning in Large Language Models

no code implementations 23 May 2024 Chengkun Cai, Xu Zhao, Yucheng Du, Haoliang Liu, Lei LI

Large Language Models (LLMs) have emerged as powerful tools in artificial intelligence, especially in complex decision-making scenarios, but their static problem-solving strategies often limit their adaptability to dynamic environments.

Decision Making Text Generation

Hybrid 3D Human Pose Estimation with Monocular Video and Sparse IMUs

no code implementations 27 Apr 2024 Yiming Bao, Xu Zhao, Dahong Qian

On the Total Capture dataset, the pose estimation error is significantly decreased compared to the baseline method.

3D Human Pose Estimation

An Embeddable Implicit IUVD Representation for Part-based 3D Human Surface Reconstruction

no code implementations 30 Jan 2024 Baoxing Li, Yong Deng, Yehui Yang, Xu Zhao

Recent approaches have combined parametric body models (such as SMPL), which capture body pose and shape priors, with neural implicit functions that flexibly learn clothing details.

Surface Reconstruction

MESA: Matching Everything by Segmenting Anything

no code implementations CVPR 2024 Yesheng Zhang, Xu Zhao

However, the pervasive presence of matching redundancy between images gives rise to unnecessary and error-prone computations in these methods, imposing limitations on their accuracy.

Image Segmentation Pose Estimation +1

RSB-Pose: Robust Short-Baseline Binocular 3D Human Pose Estimation with Occlusion Handling

no code implementations 24 Nov 2023 Xiaoyue Wan, Zhuo Chen, Yiming Bao, Xu Zhao

This perception is injected by the Pose Transformer network and learned through a pre-training task that recovers iterative masked joints.

3D Human Pose Estimation 3D Reconstruction +1

InstructCoder: Instruction Tuning Large Language Models for Code Editing

2 code implementations 31 Oct 2023 Kaixin Li, Qisheng Hu, Xu Zhao, Hui Chen, Yuxi Xie, Tiedong Liu, Qizhe Xie, Junxian He

In this work, we explore the use of Large Language Models (LLMs) to edit code based on user instructions.

Disentangled Counterfactual Reasoning for Unbiased Sequential Recommendation

no code implementations 5 Aug 2023 Yi Ren, Xu Zhao, Hongyan Tang, Shuai Li

In this paper, we propose a structural causal model-based method to address the popularity bias issue for sequential recommendation model learning.

counterfactual Counterfactual Reasoning +1

Fast Segment Anything

1 code implementation 21 Jun 2023 Xu Zhao, Wenchao Ding, Yongqi An, Yinglong Du, Tao Yu, Min Li, Ming Tang, Jinqiao Wang

In this paper, we propose a speed-up alternative method for this fundamental task with comparable performance.

Edge Detection Image Segmentation +6

CodeInstruct: Empowering Language Models to Edit Code

1 code implementation Github 2023 Qisheng Hu*, Kaixin Li*, Xu Zhao, Yuxi Xie, Tiedong Liu, Hui Chen, Qizhe Xie, Junxian He

In this work, we explore the use of large language models (LLMs) to edit code based on user instructions, covering a broad range of implicit tasks such as comment insertion, code optimization, and code refactoring.

Diversity

Self-Evaluation Guided Beam Search for Reasoning

no code implementations NeurIPS 2023 Yuxi Xie, Kenji Kawaguchi, Yiran Zhao, Xu Zhao, Min-Yen Kan, Junxian He, Qizhe Xie

Stochastic beam search balances exploitation and exploration of the search space with temperature-controlled randomness.

Arithmetic Reasoning GSM8K +4

Searching from Area to Point: A Hierarchical Framework for Semantic-Geometric Combined Feature Matching

2 code implementations 29 Apr 2023 Yesheng Zhang, Xu Zhao

This paper thus focuses on the search space and proposes to set the initial search space for point matching as the matched image areas containing prominent semantics, named semantic area matches.

Pose Estimation

TorchBench: Benchmarking PyTorch with High API Surface Coverage

1 code implementation 27 Apr 2023 Yueming Hao, Xu Zhao, Bin Bao, David Berard, Will Constable, Adnan Aziz, Xu Liu

TorchBench is able to comprehensively characterize the performance of the PyTorch software stack, guiding the performance optimization across models, PyTorch framework, and GPU libraries.

Benchmarking GPU +1

FreConv: Frequency Branch-and-Integration Convolutional Networks

no code implementations 10 Apr 2023 Zhaowen Li, Xu Zhao, Peigeng Ding, Zongxin Gao, Yuting Yang, Ming Tang, Jinqiao Wang

In the high-frequency branch, a derivative-filter-like architecture is designed to extract the high-frequency information while a light extractor is employed in the low-frequency branch because the low-frequency information is usually redundant.

ZBS: Zero-shot Background Subtraction via Instance-level Background Modeling and Foreground Selection

1 code implementation CVPR 2023 Yongqi An, Xu Zhao, Tao Yu, Haiyun Guo, Chaoyang Zhao, Ming Tang, Jinqiao Wang

However, previous unsupervised deep learning BGS algorithms perform poorly in sophisticated scenarios such as shadows or night lights, and they cannot detect objects outside the pre-defined categories.

Foreground Segmentation Object +2

Item Cold Start Recommendation via Adversarial Variational Auto-encoder Warm-up

no code implementations 28 Feb 2023 Shenzheng Zhang, Qi Tan, Xinzhi Zheng, Yi Ren, Xu Zhao

The gap between the randomly initialized item ID embedding and the well-trained warm item ID embedding makes the cold items hard to suit the recommendation system, which is trained on the data of historical warm items.

News Recommendation

Slate-Aware Ranking for Recommendation

1 code implementation 24 Feb 2023 Yi Ren, Xiao Han, Xu Zhao, Shenzheng Zhang, Yan Zhang

Therefore, the ranking stage is still essential for most applications to provide a high-quality candidate set for the re-ranking stage.

Recommendation Systems Re-Ranking

View Consistency Aware Holistic Triangulation for 3D Human Pose Estimation

no code implementations 22 Feb 2023 Xiaoyue Wan, Zhuo Chen, Xu Zhao

The rapid development of multi-view 3D human pose estimation (HPE) is attributed to the maturation of monocular 2D HPE and the geometry of 3D reconstruction.

3D Human Pose Estimation 3D Reconstruction +2

Balanced Audiovisual Dataset for Imbalance Analysis

1 code implementation 14 Feb 2023 Wenke Xia, Xu Zhao, Xincheng Pang, Changqing Zhang, Di Hu

Surprisingly, we find that multimodal models with existing imbalance algorithms consistently perform worse than the unimodal one on specific subsets, in accordance with the modality bias.

Movement Enhancement toward Multi-Scale Video Feature Representation for Temporal Action Detection

no code implementations ICCV 2023 Zixuan Zhao, Dongqi Wang, Xu Zhao

First, the submergence of movement features, i.e., the movement information in a snippet is covered by the scene information.

Action Detection

Does Deep Learning REALLY Outperform Non-deep Machine Learning for Clinical Prediction on Physiological Time Series?

no code implementations 11 Nov 2022 Ke Liao, Wei Wang, Armagan Elibol, Lingzhong Meng, Xu Zhao, Nak Young Chong

In this paper, we systematically examine the performance of machine learning models for the clinical prediction task based on the EHR, especially physiological time series.

Deep Learning Prognosis +2

Transfering Low-Frequency Features for Domain Adaptation

no code implementations 31 Aug 2022 Zhaowen Li, Xu Zhao, Chaoyang Zhao, Ming Tang, Jinqiao Wang

Previous unsupervised domain adaptation methods did not handle the cross-domain problem from the perspective of frequency for computer vision.

image-classification Image Classification +3

Improving Item Cold-start Recommendation via Model-agnostic Conditional Variational Autoencoder

1 code implementation 27 May 2022 Xu Zhao, Yi Ren, Ying Du, Shenzheng Zhang, Nian Wang

This paper attempts to tackle the item cold-start problem by generating enhanced warmed-up ID embeddings for cold items with historical data and limited interaction records.

Decoder News Recommendation +1

ETAD: Training Action Detection End to End on a Laptop

1 code implementation 14 May 2022 Shuming Liu, Mengmeng Xu, Chen Zhao, Xu Zhao, Bernard Ghanem

We propose to sequentially forward the snippet frame through the video encoder, and backward only a small necessary portion of gradients to update the encoder.

Action Detection GPU +1

Learning-Based Framework for Camera Calibration with Distortion Correction and High Precision Feature Detection

1 code implementation 1 Feb 2022 Yesheng Zhang, Xu Zhao, Dahong Qian

Therefore, in this paper, we propose a hybrid camera calibration framework which combines learning-based approaches with traditional methods to handle these bottlenecks.

Camera Calibration distortion correction +1

Pruning-aware Sparse Regularization for Network Pruning

1 code implementation 18 Jan 2022 Nanfei Jiang, Xu Zhao, Chaoyang Zhao, Yongqi An, Ming Tang, Jinqiao Wang

MaskSparsity imposes the fine-grained sparse regularization on the specific filters selected by a pruning mask, rather than all the filters of the model.

Network Pruning

Estimate Metabolite Taxonomy and Structure with a Fragment-Centered Database and Fragment Network

no code implementations 11 Jan 2021 Hansen Zhao, Xu Zhao, Huan Yao, Jiaxin Feng, Sichun Zhang, Xinrong Zhang

Metabolite structure identification has become the major bottleneck of mass-spectrometry-based metabolomics research.

Adaptive Tree Wasserstein Minimization for Hierarchical Generative Modeling

no code implementations 1 Jan 2021 ZiHao Wang, Xu Zhao, Tam Le, Hao Wu, Yong Zhang, Makoto Yamada

In this work, we consider OT over tree metrics, which is more general than the sliced Wasserstein and includes the sliced Wasserstein as a special case, and we propose a fast minimization algorithm in $O(n)$ for the optimal Wasserstein-1 transport plan between two distributions in the tree structure.

Unsupervised Domain Adaptation

An End to End Network Architecture for Fundamental Matrix Estimation

no code implementations 29 Oct 2020 Yesheng Zhang, Xu Zhao, Dahong Qian

In this paper, we present a novel end-to-end network architecture to estimate the fundamental matrix directly from stereo images.

Semi-Supervised Bilingual Lexicon Induction with Two-way Interaction

1 code implementation EMNLP 2020 Xu Zhao, ZiHao Wang, Hao Wu, Yong Zhang

In this paper, we propose a new semi-supervised BLI framework to encourage the interaction between the supervised signal and unsupervised alignment.

Bilingual Lexicon Induction Vocal Bursts Valence Prediction

Task Decoupled Knowledge Distillation For Lightweight Face Detectors

1 code implementation 14 Oct 2020 Xiaoqing Liang, Xu Zhao, Chaoyang Zhao, Nanfei Jiang, Ming Tang, Jinqiao Wang

This method decouples the distillation task of face detection into two subtasks, i.e., the classification distillation subtask and the regression distillation subtask.

Face Detection Knowledge Distillation +1

A Relaxed Matching Procedure for Unsupervised BLI

no code implementations ACL 2020 Xu Zhao, ZiHao Wang, Hao Wu, Yong Zhang

Recently, unsupervised Bilingual Lexicon Induction (BLI) without any parallel corpus has attracted much research interest.

Bilingual Lexicon Induction Translation

Parameter Sharing Decoder Pair for Auto Composing

no code implementations 31 Oct 2019 Xu Zhao

Auto Composing has been an active and appealing research area in the past few years, and much effort has been put into inventing more robust models to solve this problem.

Decoder

Multi-Granularity Fusion Network for Proposal and Activity Localization: Submission to ActivityNet Challenge 2019 Task 1 and Task 2

no code implementations 29 Jul 2019 Haisheng Su, Xu Zhao, Shuming Liu

This technical report presents an overview of our solution used in the submission to ActivityNet Challenge 2019 Task 1 (temporal action proposal generation) and Task 2 (temporal action localization/detection).

Diversity Re-Ranking +2

EdgeStereo: An Effective Multi-Task Learning Network for Stereo Matching and Edge Detection

no code implementations 5 Mar 2019 Xiao Song, Xu Zhao, Liangji Fang, Hanwen Hu

EdgeStereo also achieves comparable generalization performance for disparity estimation because of the incorporation of edge cues.

Disparity Estimation Edge Detection +3

A Tangent Distance Preserving Dimensionality Reduction Algorithm

no code implementations 4 Feb 2019 Xu Zhao, Zongli Jiang

TDPM uses tangent distance instead of geodesic distance, and then applies MDS to the tangent distance matrix to map the manifold into a low dimensional space in which we can get its nonlinear structure.

Dimensionality Reduction

Cascaded Pyramid Mining Network for Weakly Supervised Temporal Action Localization

no code implementations 28 Oct 2018 Haisheng Su, Xu Zhao, Tianwei Lin

Weakly supervised temporal action localization, which aims at temporally locating action instances in untrimmed videos using only video-level class labels during training, is an important yet challenging problem in video analysis.

General Classification Video Classification

Discriminative Representation Combinations for Accurate Face Spoofing Detection

no code implementations 27 Aug 2018 Xiao Song, Xu Zhao, Liangji Fang, Tianwei Lin

Second, we utilize SSD, a deep learning framework for detection, to excavate context cues and conduct end-to-end face presentation attack detection.

Face Presentation Attack Detection

BSN: Boundary Sensitive Network for Temporal Action Proposal Generation

17 code implementations ECCV 2018 Tianwei Lin, Xu Zhao, Haisheng Su, Chongjing Wang, Ming Yang

Temporal action proposal generation is an important yet challenging problem, since temporal proposals with rich action content are indispensable for analysing real-world videos with long duration and a high proportion of irrelevant content.

Action Detection Temporal Action Proposal Generation

EdgeStereo: A Context Integrated Residual Pyramid Network for Stereo Matching

no code implementations 14 Mar 2018 Xiao Song, Xu Zhao, Hanwen Hu, Liangji Fang

Recent convolutional neural networks, especially end-to-end disparity estimation models, achieve remarkable performance on the stereo matching task.

Disparity Estimation Edge Detection +2

Face Spoofing Detection by Fusing Binocular Depth and Spatial Pyramid Coding Micro-Texture Features

no code implementations 13 Mar 2018 Xiao Song, Xu Zhao, Tianwei Lin

The second one is a high-level micro-texture based feature called Spatial Pyramid Coding Micro-Texture (SPMT) feature.

Single Shot Temporal Action Detection

2 code implementations 17 Oct 2017 Tianwei Lin, Xu Zhao, Zheng Shou

The main drawback of this framework is that the boundaries of action instance proposals have been fixed during the classification step.

Action Detection General Classification

CoupleNet: Coupling Global Structure with Local Parts for Object Detection

3 code implementations ICCV 2017 Yousong Zhu, Chaoyang Zhao, Jinqiao Wang, Xu Zhao, Yi Wu, Hanqing Lu

To fully explore the local and global properties, in this paper, we propose a novel fully convolutional network, named CoupleNet, to couple the global structure with local parts for object detection.

Object object-detection +3

Joint Background Reconstruction and Foreground Segmentation via A Two-stage Convolutional Neural Network

no code implementations 24 Jul 2017 Xu Zhao, Yingying Chen, Ming Tang, Jinqiao Wang

In the first stage, a convolutional encoder-decoder sub-network is employed to reconstruct the background images and encode rich prior knowledge of background scenes.

Decoder Foreground Segmentation +1

Temporal Convolution Based Action Proposal: Submission to ActivityNet 2017

no code implementations 21 Jul 2017 Tianwei Lin, Xu Zhao, Zheng Shou

Our approach achieves state-of-the-art performance on both the temporal action proposal task and the temporal action localization task.

Action Classification General Classification +1
