1 code implementation • 28 May 2020 • He Huang, Martin Pouls, Anne Meyer, Markus Pauly
The computational results show that the addition of this routing data can be beneficial to the model performance.
1 code implementation • 13 Apr 2023 • Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg
TDT models for Speech Recognition achieve better accuracy and up to 2.82X faster inference than conventional Transducers.
Intent Classification • Intent Classification and Slot Filling
1 code implementation • 13 Oct 2023 • Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg
We present a novel Speech Augmented Language Model (SALM) with multitask and in-context learning capabilities.
Automatic Speech Recognition • Automatic Speech Recognition (ASR)
1 code implementation • NeurIPS 2023 • Jianfei Yang, He Huang, Yunjiao Zhou, Xinyan Chen, Yuecong Xu, Shenghai Yuan, Han Zou, Chris Xiaoxuan Lu, Lihua Xie
Extensive experiments have been conducted to compare the sensing capacity of each or several modalities in terms of multiple tasks.
1 code implementation • CVPR 2019 • He Huang, Changhu Wang, Philip S. Yu, Chang-Dong Wang
Most previous models try to learn a fixed one-directional mapping between visual and semantic space, while some recently proposed generative methods try to generate image features for unseen classes so that the zero-shot learning problem becomes a traditional fully-supervised classification problem.
3 code implementations • CVPR 2021 • Li Xu, He Huang, Jun Liu
In this paper, we create a novel dataset, SUTD-TrafficQA (Traffic Question Answering), which takes the form of video QA based on the collected 10,080 in-the-wild videos and annotated 62,535 QA pairs, for benchmarking the cognitive capability of causal inference and event understanding models in complex traffic scenarios.
Ranked #2 on Video Question Answering on SUTD-TrafficQA
1 code implementation • 17 Jan 2020 • Yang Liu, Anbu Huang, Yun Luo, He Huang, Youzhi Liu, YuanYuan Chen, Lican Feng, Tianjian Chen, Han Yu, Qiang Yang
Federated learning (FL) is a promising approach to resolve this challenge.
1 code implementation • 28 Sep 2020 • He Huang, Shunta Saito, Yuta Kikuchi, Eiichi Matsumoto, Wei Tang, Philip S. Yu
Motivated by the fact that detecting these rare relations can be critical in real-world applications, this paper introduces a novel integrated framework of classification and ranking to resolve the class imbalance problem in scene graph parsing.
1 code implementation • IEEE Conference on Computer Communications 2021 • Guoju Gao, He Huang, Mingjun Xiao, Jie Wu, Yu-E Sun, Sheng Zhang
The multi-armed bandit (MAB) model has been studied extensively to solve many online learning problems, such as rate allocation in communication networks and ad recommendation in social networks.
1 code implementation • 29 Aug 2018 • He Huang, Bokai Cao, Philip S. Yu, Chang-Dong Wang, Alex D. Leow
Mood disorders are common and associated with significant morbidity and mortality.
Human-Computer Interaction • Computers and Society
no code implementations • 12 Mar 2018 • He Huang, Philip S. Yu, Changhu Wang
There has been a drastic growth of research in Generative Adversarial Nets (GANs) in the past few years.
no code implementations • 25 Aug 2018 • He Huang, Yujing Shen, Jiankai Sun, Cewu Lu
Indoor navigation aims at performing navigation within buildings.
no code implementations • NeurIPS 2016 • He Huang, Martin Paulus
The Bayesian approach indicates that the belief of environmental stationarity positively correlates with choice optimality, but not lose-shift rate (inverted U shape).
no code implementations • NeurIPS 2013 • Sheeraz Ahmad, He Huang, Angela J. Yu
Humans and animals readily utilize active sensing, or the use of self-motion, to focus sensory and cognitive resources on the behaviorally most relevant stimuli and events in the environment.
no code implementations • 25 Dec 2018 • Damao Yang, Sihan Peng, He Huang, Hongliang Xue
We design a dispatch system to improve the peak service quality of video on demand (VOD).
no code implementations • 20 Aug 2019 • Zewen He, He Huang, Yudong Wu, Guan Huang, Wensheng Zhang
Scale variation remains a challenging problem for object detection.
no code implementations • 30 Jul 2020 • He Huang, Yuanwei Chen, Wei Tang, Wenhao Zheng, Qing-Guo Chen, Yao Hu, Philip Yu
On the other hand, there is a large semantic gap between seen and unseen classes in the existing multi-label classification datasets.
no code implementations • 10 Aug 2020 • He Huang, Su Hu, Chaowei Yuan
Cooperative Communications (CC) is one of the most critical communication technologies and plays a foundational role in the Internet of Everything in B5G/6G networks.
no code implementations • 18 Aug 2021 • Haoran Peng, He Huang, Li Xu, Tianjiao Li, Jun Liu, Hossein Rahmani, Qiuhong Ke, Zhicheng Guo, Cong Wu, Rongchang Li, Mang Ye, Jiahao Wang, Jiaxu Zhang, Yuanzhong Liu, Tao He, Fuwei Zhang, Xianbin Liu, Tao Lin
In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop in conjunction with ICCV 2021.
no code implementations • 21 Mar 2022 • Wenbo Zhang, Guang Yang, He Huang, Weiji Yang, Xiaomei Xu, Yongkai Liu, Xiaobo Lai
Moreover, the serious voxel imbalance between the brain tumor and the background as well as the different sizes and locations of the brain tumor makes the segmentation of 3D images a challenging problem.
no code implementations • 22 Aug 2022 • Jianfei Yang, Yunjiao Zhou, He Huang, Han Zou, Lihua Xie
An avatar is a representative of a physical user in the virtual world that can engage in different activities and interact with other objects in the metaverse.
no code implementations • 8 May 2023 • Dima Rekesh, Nithin Rao Koluguri, Samuel Kriman, Somshubra Majumdar, Vahid Noroozi, He Huang, Oleksii Hrinchuk, Krishna Puvvada, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg
Conformer-based models have become the dominant end-to-end architecture for speech processing tasks.
Ranked #1 on Speech Recognition on LibriSpeech test-other
no code implementations • 13 Jul 2023 • He Huang, Jagadeesh Balam, Boris Ginsburg
We study speech intent classification and slot filling (SICSF) by proposing to use an encoder pretrained on speech recognition (ASR) to initialize an end-to-end (E2E) Conformer-Transformer model, which achieves new state-of-the-art results on the SLURP dataset, with 90.14% intent accuracy and 82.27% SLURP-F1.
no code implementations • 21 Aug 2023 • Shuang Cui, Kai Han, Jing Tang, He Huang, Xueying Li, Aakas Zhiyuli, Hanxiao Li
Submodular maximization has found extensive applications in various domains within the field of artificial intelligence, including but not limited to machine learning, computer vision, and natural language processing.
no code implementations • 29 Sep 2023 • Yunjiao Zhou, Jianfei Yang, He Huang, Lihua Xie
The results demonstrate the effectiveness and robustness of AdaPose in eliminating domain shift, thereby facilitating the widespread application of WiFi-based pose estimation in smart cities.
no code implementations • 18 Oct 2023 • Tae Jin Park, He Huang, Ante Jukic, Kunal Dhawan, Krishna C. Puvvada, Nithin Koluguri, Nikolay Karpov, Aleksandr Laptev, Jagadeesh Balam, Boris Ginsburg
We present the NVIDIA NeMo team's multi-channel speech recognition system for the 7th CHiME Challenge Distant Automatic Speech Recognition (DASR) Task, focusing on the development of a multi-channel, multi-speaker speech recognition system tailored to transcribe speech from distributed microphones and microphone arrays.
no code implementations • 18 Oct 2023 • Tae Jin Park, He Huang, Coleman Hooper, Nithin Koluguri, Kunal Dhawan, Ante Jukic, Jagadeesh Balam, Boris Ginsburg
This capability offers a tailored training environment for developing neural models suited for speaker diarization and voice activity detection.
no code implementations • IEEE/ACM 31st International Symposium on Quality of Service (IWQoS) 2023 • Guoju Gao, He Huang, Jie Wu, Sijie Huang, Yang Du
In this paper, we propose a transaction-based multi-agent MAB framework, where agents can trade their bandit experience with each other to improve their total individual rewards.