Search Results for author: He Huang

Found 28 papers, 10 papers with code

Travel Time Prediction using Tree-Based Ensembles

1 code implementation • 28 May 2020 • He Huang, Martin Pouls, Anne Meyer, Markus Pauly

The computational results show that the addition of this routing data can be beneficial to the model performance.

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

1 code implementation • 13 Apr 2023 • Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg

TDT models for Speech Recognition achieve better accuracy and up to 2.82X faster inference than conventional Transducers.

Ranked #1 on Speech Recognition on facebook/multilingual_librispeech german

Intent Classification Intent Classification and Slot Filling +3

SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation

1 code implementation • 13 Oct 2023 • Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg

We present a novel Speech Augmented Language Model (SALM) with multitask and in-context learning capabilities.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing

1 code implementation • NeurIPS 2023 • Jianfei Yang, He Huang, Yunjiao Zhou, Xinyan Chen, Yuecong Xu, Shenghai Yuan, Han Zou, Chris Xiaoxuan Lu, Lihua Xie

Extensive experiments have been conducted to compare the sensing capacity of each or several modalities in terms of multiple tasks.

Action Recognition Pose Estimation

Generative Dual Adversarial Network for Generalized Zero-shot Learning

1 code implementation • CVPR 2019 • He Huang, Changhu Wang, Philip S. Yu, Chang-Dong Wang

Most previous models try to learn a fixed one-directional mapping between visual and semantic space, while some recently proposed generative methods try to generate image features for unseen classes so that the zero-shot learning problem becomes a traditional fully-supervised classification problem.

Generalized Zero-Shot Learning Metric Learning

SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events

3 code implementations • CVPR 2021 • Li Xu, He Huang, Jun Liu

In this paper, we create a novel dataset, SUTD-TrafficQA (Traffic Question Answering), which takes the form of video QA based on the collected 10,080 in-the-wild videos and annotated 62,535 QA pairs, for benchmarking the cognitive capability of causal inference and event understanding models in complex traffic scenarios.

Ranked #2 on Video Question Answering on SUTD-TrafficQA

Autonomous Vehicles Benchmarking +4

FedVision: An Online Visual Object Detection Platform Powered by Federated Learning

1 code implementation • 17 Jan 2020 • Yang Liu, Anbu Huang, Yun Luo, He Huang, Youzhi Liu, YuanYuan Chen, Lican Feng, Tianjian Chen, Han Yu, Qiang Yang

Federated learning (FL) is a promising approach to resolve this challenge.

Federated Learning object-detection +1

Addressing Class Imbalance in Scene Graph Parsing by Learning to Contrast and Score

1 code implementation • 28 Sep 2020 • He Huang, Shunta Saito, Yuta Kikuchi, Eiichi Matsumoto, Wei Tang, Philip S. Yu

Motivated by the fact that detecting these rare relations can be critical in real-world applications, this paper introduces a novel integrated framework of classification and ranking to resolve the class imbalance problem in scene graph parsing.

Auction-Based Combinatorial Multi-Armed Bandit Mechanisms with Strategic Arms

1 code implementation • IEEE Conference on Computer Communications 2021 • Guoju Gao, He Huang, Mingjun Xiao, Jie Wu, Yu-E Sun, Sheng Zhang

The multi-armed bandit (MAB) model has been deeply studied to solve many online learning problems, such as rate allocation in communication networks, Ad recommendation in social networks, etc.

Computational Efficiency

dpMood: Exploiting Local and Periodic Typing Dynamics for Personalized Mood Prediction

1 code implementation • 29 Aug 2018 • He Huang, Bokai Cao, Philip S. Yu, Chang-Dong Wang, Alex D. Leow

Mood disorders are common and associated with significant morbidity and mortality.

Human-Computer Interaction Computers and Society

An Introduction to Image Synthesis with Generative Adversarial Nets

no code implementations • 12 Mar 2018 • He Huang, Philip S. Yu, Changhu Wang

There has been a drastic growth of research in Generative Adversarial Nets (GANs) in the past few years.

Image-to-Image Translation Translation

NavigationNet: A Large-scale Interactive Indoor Navigation Dataset

no code implementations • 25 Aug 2018 • He Huang, Yujing Shen, Jiankai Sun, Cewu Lu

Indoor navigation aims at performing navigation within buildings.

reinforcement-learning Reinforcement Learning (RL) +1

Learning under uncertainty: a comparison between R-W and Bayesian approach

no code implementations • NeurIPS 2016 • He Huang, Martin Paulus

The Bayesian approach indicates that the belief of environmental stationarity positively correlates with choice optimality, but not lose-shift rate (inverted U shape).

Context-sensitive active sensing in humans

no code implementations • NeurIPS 2013 • Sheeraz Ahmad, He Huang, Angela J. Yu

Humans and animals readily utilize active sensing, or the use of self-motion, to focus sensory and cognitive resources on the behaviorally most relevant stimuli and events in the environment.

On-Demand Video Dispatch Networks: A Scalable End-to-End Learning Approach

no code implementations • 25 Dec 2018 • Damao Yang, Sihan Peng, He Huang, Hongliang Xue

We design a dispatch system to improve the peak service quality of video on demand (VOD).

Clustering

Instance Scale Normalization for image understanding

no code implementations • 20 Aug 2019 • Zewen He, He Huang, Yudong Wu, Guan Huang, Wensheng Zhang

Scale variation remains a challenging problem for object detection.

Instance Segmentation Object +5

Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge

no code implementations • 30 Jul 2020 • He Huang, Yuanwei Chen, Wei Tang, Wenhao Zheng, Qing-Guo Chen, Yao Hu, Philip Yu

On the other hand, there is a large semantic gap between seen and unseen classes in the existing multi-label classification datasets.

Classification General Classification +3

Cooperative Communications for Internet of Everything in B5G/6G Hybrid and Ubiquitous Networks: Foundation, Further Optimization and Solutions

no code implementations • 10 Aug 2020 • He Huang, Su Hu, Chaowei Yuan

Cooperative Communications (CC) has been one of the most critical communication technologies, playing a founding role in the Internet of Everything in B5G/6G networks.

The Multi-Modal Video Reasoning and Analyzing Competition

no code implementations • 18 Aug 2021 • Haoran Peng, He Huang, Li Xu, Tianjiao Li, Jun Liu, Hossein Rahmani, Qiuhong Ke, Zhicheng Guo, Cong Wu, Rongchang Li, Mang Ye, Jiahao Wang, Jiaxu Zhang, Yuanzhong Liu, Tao He, Fuwei Zhang, Xianbin Liu, Tao Lin

In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop in conjunction with ICCV 2021.

Action Recognition Person Re-Identification +3

ME-Net: Multi-Encoder Net Framework for Brain Tumor Segmentation

no code implementations • 21 Mar 2022 • Wenbo Zhang, Guang Yang, He Huang, Weiji Yang, Xiaomei Xu, Yongkai Liu, Xiaobo Lai

Moreover, the serious voxel imbalance between the brain tumor and the background as well as the different sizes and locations of the brain tumor makes the segmentation of 3D images a challenging problem.

Brain Tumor Segmentation Segmentation +1

MetaFi: Device-Free Pose Estimation via Commodity WiFi for Metaverse Avatar Simulation

no code implementations • 22 Aug 2022 • Jianfei Yang, Yunjiao Zhou, He Huang, Han Zou, Lihua Xie

Avatar refers to a representative of a physical user in the virtual world that can engage in different activities and interact with other objects in metaverse.

Pose Estimation

Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition

no code implementations • 8 May 2023 • Dima Rekesh, Nithin Rao Koluguri, Samuel Kriman, Somshubra Majumdar, Vahid Noroozi, He Huang, Oleksii Hrinchuk, Krishna Puvvada, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg

Conformer-based models have become the dominant end-to-end architecture for speech processing tasks.

Ranked #1 on Speech Recognition on LibriSpeech test-other

Automatic Speech Recognition speech-recognition +3

Leveraging Pretrained ASR Encoders for Effective and Efficient End-to-End Speech Intent Classification and Slot Filling

no code implementations • 13 Jul 2023 • He Huang, Jagadeesh Balam, Boris Ginsburg

We study speech intent classification and slot filling (SICSF) by proposing to use an encoder pretrained on speech recognition (ASR) to initialize an end-to-end (E2E) Conformer-Transformer model, which achieves the new state-of-the-art results on the SLURP dataset, with 90.14% intent accuracy and 82.27% SLURP-F1.

intent-classification Intent Classification +7

Practical Parallel Algorithms for Non-Monotone Submodular Maximization

no code implementations • 21 Aug 2023 • Shuang Cui, Kai Han, Jing Tang, He Huang, Xueying Li, Aakas Zhiyuli, Hanxiao Li

Submodular maximization has found extensive applications in various domains within the field of artificial intelligence, including but not limited to machine learning, computer vision, and natural language processing.

AdaPose: Towards Cross-Site Device-Free Human Pose Estimation with Commodity WiFi

no code implementations • 29 Sep 2023 • Yunjiao Zhou, Jianfei Yang, He Huang, Lihua Xie

The results demonstrate the effectiveness and robustness of AdaPose in eliminating domain shift, thereby facilitating the widespread application of WiFi-based pose estimation in smart cities.

Domain Adaptation Pose Estimation

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System

no code implementations • 18 Oct 2023 • Tae Jin Park, He Huang, Ante Jukic, Kunal Dhawan, Krishna C. Puvvada, Nithin Koluguri, Nikolay Karpov, Aleksandr Laptev, Jagadeesh Balam, Boris Ginsburg

We present the NVIDIA NeMo team's multi-channel speech recognition system for the 7th CHiME Challenge Distant Automatic Speech Recognition (DASR) Task, focusing on the development of a multi-channel, multi-speaker speech recognition system tailored to transcribe speech from distributed microphones and microphone arrays.

Automatic Speech Recognition speaker-diarization +3

Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation

no code implementations • 18 Oct 2023 • Tae Jin Park, He Huang, Coleman Hooper, Nithin Koluguri, Kunal Dhawan, Ante Jukic, Jagadeesh Balam, Boris Ginsburg

This capability offers a tailored training environment for developing neural models suited for speaker diarization and voice activity detection.

Action Detection Activity Detection +3

You Can Trade Your Experience in Distributed Multi-Agent Multi-Armed Bandits

no code implementations • 2023 IEEE/ACM 31st International Symposium on Quality of Service (IWQoS) 2023 • Guoju Gao, He Huang, Jie Wu, Sijie Huang, Yang Du

In this paper, we propose a transaction-based multi-agent MAB framework, where agents can trade their bandit experience with each other to improve their total individual rewards.

Decision Making Multi-Armed Bandits
