Search Results for author: Yang Xiao

Found 78 papers, 39 papers with code

Pixel-Pair Occlusion Relationship Map (P2ORM): Formulation, Inference & Application

no code implementations ECCV 2020 Xuchong Qiu, Yang Xiao, Chaohui Wang, Renaud Marlet

We formalize concepts around geometric occlusion in 2D images (i.e., ignoring semantics), and propose a novel unified formulation of both occlusion boundaries and occlusion orientations via a pixel-pair occlusion relation.

Monocular Depth Estimation

Buffer is All You Need: Defending Federated Learning against Backdoor Attacks under Non-iids via Buffering

no code implementations 30 Mar 2025 Xingyu Lyu, Ning Wang, Yang Xiao, Shixiong Li, Tao Li, Danjue Chen, Yimin Chen

FLBuff is inspired by our insight that non-iids can be modeled as omni-directional expansion in representation space while backdoor attacks as uni-directional.

All Contrastive Learning +1

RGB-Phase Speckle: Cross-Scene Stereo 3D Reconstruction via Wrapped Pre-Normalization

no code implementations 8 Mar 2025 Kai Yang, Zijian Bai, Yang Xiao, Xinyu Li, Xiaohan Shi

3D reconstruction garners increasing attention alongside the advancement of high-level image applications, where dense stereo matching (DSM) serves as a pivotal technique.

3D Reconstruction Stereo Matching

Correcting Noisy Multilabel Predictions: Modeling Label Noise through Latent Space Shifts

no code implementations 20 Feb 2025 Weipeng Huang, Qin Li, Yang Xiao, Cheng Qiao, Tie Cai, Junwei Liao, Neil J. Hurley, Guangyuan Piao

Our model posits that label noise arises from a stochastic shift in the latent variable, providing a more robust and beneficial means for noisy learning.

LIMO: Less is More for Reasoning

3 code implementations 5 Feb 2025 Yixin Ye, Zhen Huang, Yang Xiao, Ethan Chern, Shijie Xia, PengFei Liu

While conventional wisdom suggests that sophisticated reasoning tasks demand extensive training data (>100,000 examples), we demonstrate that complex mathematical reasoning abilities can be effectively elicited with surprisingly few examples.

Math Mathematical Reasoning +2

Towards Distributed Backdoor Attacks with Network Detection in Decentralized Federated Learning

no code implementations 25 Jan 2025 Bohan Liu, Yang Xiao, Ruimeng Ye, Zinan Ling, Xiaolong Ma, Bo Hui

In this paper, we experimentally demonstrate that, when DBA is applied directly to decentralized FL, the attack success rate depends on the distribution of attackers in the network architecture.

Federated Learning

Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection

1 code implementation 2 Nov 2024 Han Yin, Yang Xiao, Jisheng Bai, Rohan Kumar Das

Sound Event Detection (SED) is challenging in noisy environments where overlapping sounds obscure target events.

Ranked #1 on Sound Event Detection on WildDESED (using extra training data)

Audio Source Separation Event Detection +1

Weak-to-Strong Generalization beyond Accuracy: a Pilot Study in Safety, Toxicity, and Legal Reasoning

1 code implementation 16 Oct 2024 Ruimeng Ye, Yang Xiao, Bo Hui

We remark that existing works investigate the phenomenon of weak-to-strong generalization in an analogous setup (i.e., binary classification), rather than in practical alignment-relevant tasks (e.g., safety).

Binary Classification Legal Reasoning

Dark Experience for Incremental Keyword Spotting

no code implementations 12 Sep 2024 Tianyi Peng, Yang Xiao

Spoken keyword spotting (KWS) is crucial for identifying keywords within audio inputs and is widely used in applications like Apple Siri and Google Home, particularly on edge devices.

Continual Learning Keyword Spotting

TF-Mamba: A Time-Frequency Network for Sound Source Localization

no code implementations 8 Sep 2024 Yang Xiao, Rohan Kumar Das

We consider the Mamba-based model to analyze spatial features from speech signals by fusing both time and frequency features, and we develop an SSL system called TF-Mamba.

Mamba Sound Source Localization +1

NGP-RT: Fusing Multi-Level Hash Features with Lightweight Attention for Real-Time Novel View Synthesis

no code implementations 15 Jul 2024 Yubin Hu, Xiaoyang Guo, Yang Xiao, Jingwei Huang, Yong-Jin Liu

Although it achieves fast training speed, there is still a lot of room for improvement in its rendering speed due to the per-point MLP executions for implicit multi-level feature aggregation, especially for real-time applications.

NeRF Novel View Synthesis

BoBa: Boosting Backdoor Detection through Data Distribution Inference in Federated Learning

no code implementations 12 Jul 2024 Ning Wang, Shanghao Shi, Yang Xiao, Yimin Chen, Y. Thomas Hou, Wenjing Lou

Based on the intuition that clustering and subsequent backdoor detection can drastically benefit from knowing client data distributions, we propose a novel data distribution inference mechanism.

Anomaly Detection Backdoor Attack +3

Mixstyle based Domain Generalization for Sound Event Detection with Heterogeneous Training Data

no code implementations 4 Jul 2024 Yang Xiao, Han Yin, Jisheng Bai, Rohan Kumar Das

This work explores domain generalization (DG) for sound event detection (SED), advancing adaptability towards real-world scenarios.

Domain Generalization Event Detection +1

WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System

1 code implementation 4 Jul 2024 Yang Xiao, Rohan Kumar Das

This work aims to advance sound event detection (SED) research by presenting a new large language model (LLM)-powered dataset, namely wild domestic environment sound event detection (WildDESED).

Event Detection Language Modeling +3

UCIL: An Unsupervised Class Incremental Learning Approach for Sound Event Detection

no code implementations 4 Jul 2024 Yang Xiao, Rohan Kumar Das

This work explores class-incremental learning (CIL) for sound event detection (SED), advancing adaptability towards real-world scenarios.

class-incremental learning Class Incremental Learning +4

FMSG-JLESS Submission for DCASE 2024 Task4 on Sound Event Detection with Heterogeneous Training Dataset and Potentially Missing Labels

no code implementations 29 Jun 2024 Yang Xiao, Han Yin, Jisheng Bai, Rohan Kumar Das

Our proposed method shows superior macro-average pAUC and polyphonic SED score performance on the DCASE 2024 Challenge Task 4 validation dataset and public evaluation dataset.

Domain Generalization Event Detection +2

Advancing Airport Tower Command Recognition: Integrating Squeeze-and-Excitation and Broadcasted Residual Learning

no code implementations 26 Jun 2024 Yuanxi Lin, Tonglin Zhou, Yang Xiao

These findings highlight the effectiveness of our model advancements in improving speech command recognition for aviation safety and efficiency in noisy, high-stakes environments.

Keyword Spotting

Towards a Client-Centered Assessment of LLM Therapists by Client Simulation

no code implementations 18 Jun 2024 Jiashuo Wang, Yang Xiao, Yanran Li, Changhe Song, Chunpu Xu, Chenhao Tan, Wenjie Li

To this end, we adopt LLMs to simulate clients and propose ClientCAST, a client-centered approach to assessing LLM therapists by client simulation.

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

1 code implementation 18 Jun 2024 Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, Dahua Lin, Yu Qiao, PengFei Liu

We delve into the models' cognitive reasoning abilities, their performance across different modalities, and their outcomes in process-level evaluations, which are vital for tasks requiring complex reasoning with lengthy solutions.

Benchmarking scientific discovery

Post-hoc and manifold explanations analysis of facial expression data based on deep learning

1 code implementation 29 Apr 2024 Yang Xiao

This study not only advances the application of AI technology in the field of psychology but also provides a new psychological, theoretical understanding of how AI processes information.

Deep Learning Memorization

A Survey on Long Video Generation: Challenges, Methods, and Prospects

no code implementations 25 Mar 2024 Chengxuan Li, Di Huang, Zeyu Lu, Yang Xiao, Qingqi Pei, Lei Bai

Video generation is a rapidly advancing research area, garnering significant attention due to its broad range of applications.

Survey Video Generation

CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner

no code implementations 15 Mar 2024 Tingbing Yan, Wenzheng Zeng, Yang Xiao, Xingyu Tong, Bo Tan, Zhiwen Fang, Zhiguo Cao, Joey Tianyi Zhou

Most existing one-shot skeleton-based action recognition focuses on raw low-level information (e.g., joint location), and may suffer from local information loss and low generalization ability.

Skeleton Based Action Recognition

A Survey of Lottery Ticket Hypothesis

no code implementations 7 Mar 2024 Bohan Liu, Zijie Zhang, Peixiong He, Zhensen Wang, Yang Xiao, Ruimeng Ye, Yang Zhou, Wei-Shinn Ku, Bo Hui

The Lottery Ticket Hypothesis (LTH) states that a dense neural network model contains a highly sparse subnetwork (i.e., winning tickets) that can achieve even better performance than the original model when trained in isolation.

Survey

Dual Knowledge Distillation for Efficient Sound Event Detection

no code implementations 5 Feb 2024 Yang Xiao, Rohan Kumar Das

To address this issue, we introduce a novel framework referred to as dual knowledge distillation for developing efficient SED systems in this work.

Ranked #2 on Sound Event Detection on DESED (using extra training data)

Event Detection Knowledge Distillation +1

How Far Are LLMs from Believable AI? A Benchmark for Evaluating the Believability of Human Behavior Simulation

1 code implementation 28 Dec 2023 Yang Xiao, Yi Cheng, Jinlan Fu, Jiashuo Wang, Wenjie Li, PengFei Liu

In recent years, AI has demonstrated remarkable capabilities in simulating human behaviors, particularly those implemented with large language models (LLMs).

AI Agent Language Modelling

SAI3D: Segment Any Instance in 3D Scenes

no code implementations CVPR 2024 Yingda Yin, Yuzheng Liu, Yang Xiao, Daniel Cohen-Or, Jingwei Huang, Baoquan Chen

Advancements in 3D instance segmentation have traditionally been tethered to the availability of annotated datasets, limiting their application to a narrow spectrum of object categories.

3D Instance Segmentation Scene Parsing +2

Scale-MIA: A Scalable Model Inversion Attack against Secure Federated Learning via Latent Space Reconstruction

1 code implementation 10 Nov 2023 Shanghao Shi, Ning Wang, Yang Xiao, Chaoyu Zhang, Yi Shi, Y. Thomas Hou, Wenjing Lou

The first step is to reconstruct the latent space representations (LSRs) from the aggregated model updates using a closed-form inversion mechanism, leveraging specially crafted linear layers.

Federated Learning

End-to-end Video Gaze Estimation via Capturing Head-face-eye Spatial-temporal Interaction Context

1 code implementation 27 Oct 2023 Yiran Guan, Zhuoguang Chen, Wenzheng Zeng, Zhiguo Cao, Yang Xiao

In this letter, we propose a new method, Multi-Clue Gaze (MCGaze), to facilitate video gaze estimation by capturing the spatial-temporal interaction context among head, face, and eye in an end-to-end learning manner, which has not yet been well studied.

Gaze Estimation

Multi-Ship Tracking by Robust Similarity metric

no code implementations 8 Oct 2023 Hongyu Zhao, Gongming Wei, Yang Xiao, Xianglei Xing

The low frame rates and severe image shake caused by wave turbulence in ship datasets often result in minimal, or even zero, Intersection over Union (IoU) between the predicted and detected bounding boxes.

Multi-Object Tracking Object

A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image

1 code implementation CVPR 2023 Changlong Jiang, Yang Xiao, Cunlin Wu, Mingyang Zhang, Jinghong Zheng, Zhiguo Cao, Joey Tianyi Zhou

3D interacting hand pose estimation from a single RGB image is a challenging task, due to serious self-occlusion and inter-occlusion of the hands, confusingly similar appearance patterns between the two hands, ill-posed joint position mapping from 2D to 3D, etc. To address these, we propose to extend A2J, the state-of-the-art depth-based 3D single hand pose estimation method, to the RGB domain under interacting hand conditions.

3D Interacting Hand Pose Estimation Hand Pose Estimation +1

PIZZA: A Powerful Image-only Zero-Shot Zero-CAD Approach to 6 DoF Tracking

1 code implementation 15 Sep 2022 Van Nguyen Nguyen, Yuming Du, Yang Xiao, Michael Ramamonjisoa, Vincent Lepetit

Our results on challenging datasets are on par with previous works that require much more information (training images of the target objects, 3D models, and/or depth data).

3D geometry

Learning from Noisy Labels with Coarse-to-Fine Sample Credibility Modeling

no code implementations 23 Aug 2022 Boshen Zhang, Yuxi Li, Yuanpeng Tu, Jinlong Peng, Yabiao Wang, Cunlin Wu, Yang Xiao, Cairong Zhao

Specifically, for the clean set, we deliberately design a memory-based modulation scheme to dynamically adjust the contribution of each sample in terms of its historical credibility sequence during training, thus alleviating the effect from noisy samples incorrectly grouped into the clean set.

Denoising Image Classification

Continual Learning For On-Device Environmental Sound Classification

1 code implementation 15 Jul 2022 Yang Xiao, Xubo Liu, James King, Arshdeep Singh, Eng Siong Chng, Mark D. Plumbley, Wenwu Wang

Experimental results on the DCASE 2019 Task 1 and ESC-50 dataset show that our proposed method outperforms baseline continual learning methods on classification accuracy and computational efficiency, indicating our method can efficiently and incrementally learn new classes without the catastrophic forgetting problem for on-device environmental sound classification.

Classification Computational Efficiency +3

CANShield: Deep Learning-Based Intrusion Detection Framework for Controller Area Networks at the Signal-Level

1 code implementation 3 May 2022 Md Hasan Shahriar, Yang Xiao, Pablo Moriano, Wenjing Lou, Y. Thomas Hou

As ordinary injection attacks disrupt the typical timing properties of the CAN data stream, rule-based intrusion detection systems (IDS) can easily detect them.

Intrusion Detection Time Series +1

Templates for 3D Object Pose Estimation Revisited: Generalization to New Objects and Robustness to Occlusions

2 code implementations CVPR 2022 Van Nguyen Nguyen, Yinlin Hu, Yang Xiao, Mathieu Salzmann, Vincent Lepetit

It relies on a small set of training objects to learn local object representations, which allow us to locally match the input image to a set of "templates", rendered images of the CAD models for the new objects.

6D Pose Estimation 6D Pose Estimation using RGB +1

Rainbow Keywords: Efficient Incremental Learning for Online Spoken Keyword Spotting

1 code implementation 30 Mar 2022 Yang Xiao, Nana Hou, Eng Siong Chng

Catastrophic forgetting is a thorny challenge when updating keyword spotting (KWS) models after deployment.

Data Augmentation Diversity +4

DataLab: A Platform for Data Analysis and Intervention

no code implementations ACL 2022 Yang Xiao, Jinlan Fu, Weizhe Yuan, Vijay Viswanathan, Zhoumianze Liu, Yixin Liu, Graham Neubig, PengFei Liu

Despite data's crucial role in machine learning, most existing tools and research tend to focus on systems on top of existing data rather than how to interpret and manipulate data.

Re-ranking for image retrieval and transductive few-shot classification

no code implementations NeurIPS 2021 Xi Shen, Yang Xiao, Shell Hu, Othman Sbai, Mathieu Aubry

In the problems of image retrieval and few-shot classification, the mainstream approaches focus on learning a better feature representation.

Classification Few-Shot Learning +3

UVO Challenge on Video-based Open-World Segmentation 2021: 1st Place Solution

2 code implementations 22 Oct 2021 Yuming Du, Wen Guo, Yang Xiao, Vincent Lepetit

In this report, we introduce our (pretty straightforward) two-step "detect-then-match" video instance segmentation method.

Instance Segmentation Optical Flow Estimation +3

On the Robustness of Reading Comprehension Models to Entity Renaming

1 code implementation NAACL 2022 Jun Yan, Yang Xiao, Sagnik Mukherjee, Bill Yuchen Lin, Robin Jia, Xiang Ren

We study the robustness of machine reading comprehension (MRC) models to entity renaming -- do models make more wrong predictions when the same questions are asked about an entity whose name has been changed?

Continual Pretraining Machine Reading Comprehension

Robust Learning with Adaptive Sample Credibility Modeling

no code implementations 29 Sep 2021 Boshen Zhang, Yuxi Li, Yuanpeng Tu, Yabiao Wang, Yang Xiao, Cai Rong Zhao, Chengjie Wang

For the clean set, we deliberately design a memory-based modulation scheme to dynamically adjust the contribution of each sample in terms of its historical credibility sequence during training, thereby alleviating the effect of potential hard noisy samples in the clean set.

Denoising

Learning to Better Segment Objects from Unseen Classes with Unlabeled Videos

no code implementations ICCV 2021 Yuming Du, Yang Xiao, Vincent Lepetit

Through extensive experiments, we show that our method can generate a high-quality training set which significantly boosts the performance of segmenting objects of unseen classes.

Object Open-World Instance Segmentation +3

ExplainaBoard: An Explainable Leaderboard for NLP

1 code implementation ACL 2021 PengFei Liu, Jinlan Fu, Yang Xiao, Weizhe Yuan, Shuaicheng Chang, Junqi Dai, Yixin Liu, Zihuiwen Ye, Zi-Yi Dou, Graham Neubig

In this paper, we present a new conceptualization and implementation of NLP evaluation: the ExplainaBoard, which in addition to inheriting the functionality of the standard leaderboard, also allows researchers to (i) diagnose strengths and weaknesses of a single system (e.g., what is the best-performing system bad at?)

Machine Translation

Curriculum CycleGAN for Textual Sentiment Domain Adaptation with Multiple Sources

1 code implementation 17 Nov 2020 Sicheng Zhao, Yang Xiao, Jiang Guo, Xiangyu Yue, Jufeng Yang, Ravi Krishna, Pengfei Xu, Kurt Keutzer

C-CycleGAN transfers source samples at instance-level to an intermediate domain that is closer to the target domain with sentiment semantics preserved and without losing discriminative features.

Domain Adaptation Generative Adversarial Network +2

Partial FC: Training 10 Million Identities on a Single Machine

7 code implementations 11 Oct 2020 Xiang An, Xuhan Zhu, Yang Xiao, Lan Wu, Ming Zhang, Yuan Gao, Bin Qin, Debing Zhang, Ying Fu

The experiment demonstrates no loss of accuracy when training with only 10% randomly sampled classes for the softmax-based loss functions, compared with training with full classes using state-of-the-art models on mainstream benchmarks.

Face Identification Face Recognition +2

Pixel-Pair Occlusion Relationship Map (P2ORM): Formulation, Inference & Application

1 code implementation 23 Jul 2020 Xuchong Qiu, Yang Xiao, Chaohui Wang, Renaud Marlet

The former provides a way to generate large-scale accurate occlusion datasets while, based on the latter, we propose a novel method for task-independent pixel-level occlusion relationship estimation from single images.

Monocular Depth Estimation Occlusion Estimation

ECML: An Ensemble Cascade Metric Learning Mechanism towards Face Verification

1 code implementation 11 Jul 2020 Fu Xiong, Yang Xiao, Zhiguo Cao, Yancheng Wang, Joey Tianyi Zhou, Jianxi Wu

Embedding RMML into the proposed ECML mechanism, our metric learning paradigm (EC-RMML) can run in the one-pass learning manner.

Face Verification Fine-Grained Visual Recognition +1

Empirical Bayes Transductive Meta-Learning with Synthetic Gradients

2 code implementations ICLR 2020 Shell Xu Hu, Pablo G. Moreno, Yang Xiao, Xi Shen, Guillaume Obozinski, Neil D. Lawrence, Andreas Damianou

The evidence lower bound of the marginal log-likelihood of empirical Bayes decomposes as a sum of local KL divergences between the variational posterior and the true posterior on the query set of each task.

Few-Shot Image Classification Meta-Learning +3

A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image

2 code implementations ICCV 2019 Fu Xiong, Boshen Zhang, Yang Xiao, Zhiguo Cao, Taidong Yu, Joey Tianyi Zhou, Junsong Yuan

For the task of 3D hand and body pose estimation in depth images, a novel anchor-based approach termed Anchor-to-Joint regression network (A2J), with end-to-end learning ability, is proposed.

3D Pose Estimation Decoder +2

Comparative evaluation of 2D feature correspondence selection algorithms

1 code implementation 30 Apr 2019 Chen Zhao, Jiaqi Yang, Yang Xiao, Zhiguo Cao

Correspondence selection, which aims to find correct feature correspondences among raw feature matches, is pivotal for a number of feature-matching-based tasks.

Diversity

Towards Real-time Eyeblink Detection in The Wild: Dataset, Theory and Practices

no code implementations 21 Feb 2019 Guilei Hu, Yang Xiao, Zhiguo Cao, Lubin Meng, Zhiwen Fang, Joey Tianyi Zhou, Junsong Yuan

Effective and real-time eyeblink detection has a wide range of applications, such as deception detection, driver fatigue detection, face anti-spoofing, etc.

Attribute +3

Towards Good Practices on Building Effective CNN Baseline Model for Person Re-identification

1 code implementation 29 Jul 2018 Fu Xiong, Yang Xiao, Zhiguo Cao, Kaicheng Gong, Zhiwen Fang, Joey Tianyi Zhou

Person re-identification is indeed a challenging visual recognition task due to the critical issues of human pose variation, human body occlusion, camera view variation, etc.

Open-Ended Question Answering Person Re-Identification

Performance Evaluation of 3D Correspondence Grouping Algorithms

no code implementations 6 Apr 2018 Jiaqi Yang, Ke Xian, Yang Xiao, Zhiguo Cao

This paper presents a thorough evaluation of several widely-used 3D correspondence grouping algorithms, motivated by their significance in vision tasks relying on correct feature correspondences.

3D Object Recognition Point Cloud Registration +1

TasselNet: Counting maize tassels in the wild via local counts regression network

no code implementations 7 Jul 2017 Hao Lu, Zhiguo Cao, Yang Xiao, Bohan Zhuang, Chunhua Shen

To our knowledge, this is the first time that a plant-related counting problem has been addressed using computer vision technologies in an unconstrained field-based environment.

Plant Phenotyping regression

Predicting Restaurant Consumption Level through Social Media Footprints

no code implementations COLING 2016 Yang Xiao, Yu-An Wang, Hangyu Mao, Zhen Xiao

Accurate prediction of user attributes from social media is valuable for both social science analysis and consumer targeting.
