no code implementations • 7 Jan 2025 • Jingquan Wang, Harry Zhang, Khailanii Slaton, Shu Wang, Radu Serban, Jinlong Wu, Dan Negrut
The integration of advanced simulation technologies with artificial intelligence (AI) is revolutionizing science and engineering research.
no code implementations • 11 Dec 2024 • Harry Zhang, Luca Carlone
We introduce CUPS, a novel method for learning sequence-to-sequence 3D human shapes and poses from RGB videos with uncertainty quantification.
no code implementations • 2 Dec 2024 • Jingnan Shi, Rajat Talak, Harry Zhang, David Jin, Luca Carlone
Our first contribution is to introduce CRISP, a category-agnostic object pose and shape estimation pipeline.
1 code implementation • 4 Oct 2024 • Aaron Young, Nevindu M. Batagoda, Harry Zhang, Akshat Dave, Adithya Pediredla, Dan Negrut, Ramesh Raskar
We present a novel approach that leverages Non-Line-of-Sight (NLOS) sensing using single-photon LiDAR to improve visibility and enhance autonomous navigation.
1 code implementation • 21 Aug 2024 • Jingquan Wang, Harry Zhang, Huzaifa Mustafa Unjhawala, Peter Negrut, Shu Wang, Khailanii Slaton, Radu Serban, Jin-Long Wu, Dan Negrut
Given a collection of S-LLMs, this benchmark enables the ranking of the S-LLMs based on their ability to produce high-quality DTs.
no code implementations • 20 Aug 2024 • Hyun Jin Park, Dhruuv Agarwal, Neng Chen, Rentao Sun, Kurt Partridge, Justin Chen, Harry Zhang, Pai Zhu, Jacob Bartel, Kyle Kastner, Gary Wang, Andrew Rosenberg, Quan Wang
The keyword spotting (KWS) problem requires large amounts of real speech training data to achieve high accuracy across diverse populations.
no code implementations • 26 Jul 2024 • Hyun Jin Park, Dhruuv Agarwal, Neng Chen, Rentao Sun, Kurt Partridge, Justin Chen, Harry Zhang, Pai Zhu, Jacob Bartel, Kyle Kastner, Gary Wang, Andrew Rosenberg, Quan Wang
This paper explores the use of TTS-synthesized training data for the keyword spotting (KWS) task while minimizing development cost and time.
no code implementations • 27 May 2024 • Harry Zhang, Luca Carlone
This process results in a differentiable conformal predictor that is trained end-to-end with the 3D pose estimator.
1 code implementation • 25 May 2024 • Harry Zhang
The new stability-augmented framework consists of a neural-network-based learner that learns to construct a Lyapunov function, and a model-based RL agent to consistently complete the tasks while satisfying user-specified constraints given only sub-optimal demonstrations and sparse-cost feedback.
Model-based Reinforcement Learning
Model Predictive Control
no code implementations • 14 May 2024 • David Jin, Harry Zhang, Kai Chang
We perform a detailed theoretical analysis of a recently proposed expectation-maximization-based algorithm for solving a variation of the 3D registration problem, named multi-model 3D registration.
no code implementations • 14 May 2024 • Priya Sundaresan, Aditya Ganapathi, Harry Zhang, Shivin Devgon
We investigate the problem of pixelwise correspondence for deformable objects, namely cloth and rope, by comparing both classical and learning-based methods.
no code implementations • 16 Feb 2024 • David Jin, Sushrut Karmalkar, Harry Zhang, Luca Carlone
We investigate a variation of the 3D registration problem, named multi-model 3D registration.
no code implementations • 29 Sep 2023 • Lillian Zhou, Yuxin Ding, Mingqing Chen, Harry Zhang, Rohit Prabhavalkar, Dhruv Guliani, Giovanni Motta, Rajiv Mathews
Automatic speech recognition (ASR) models are typically trained on large datasets of transcribed speech.
Automatic Speech Recognition (ASR)
2 code implementations • 21 Sep 2023 • Bo-Hsun Chen, Peter Negrut, Thomas Liang, Nevindu Batagoda, Harry Zhang, Dan Negrut
For Task 2, one can produce as much data as desired to train and test AI algorithms that are anticipated to work in lunar conditions.
no code implementations • 24 Aug 2023 • Yupu Yao, ShangQi Deng, ZiHan Cao, Harry Zhang, Liang-Jian Deng
One underlying cause is that traditional diffusion models approximate Gaussian noise distribution by utilizing predictive noise, without fully accounting for the impact of inherent information within the input itself.
no code implementations • 25 May 2023 • Sitian Shen, Zilin Zhu, Linqian Fan, Harry Zhang, Xinxiao Wu
Large pre-trained models have had a significant impact on computer vision by enabling multi-modal learning, where the CLIP model has achieved impressive results in image classification, object detection, and semantic segmentation.
1 code implementation • 17 Nov 2022 • Chuer Pan, Brian Okorn, Harry Zhang, Ben Eisner, David Held
We conjecture that this relationship is a generalizable notion of a manipulation task that can transfer to new objects in the same category; examples include the relationship between the pose of a pan relative to an oven or the pose of a mug relative to a mug rack.
no code implementations • 9 May 2022 • Ben Eisner, Harry Zhang, David Held
We propose a vision-based system that learns to predict the potential motions of the parts of a variety of articulated objects to guide downstream motion planning of the system to articulate the objects.
no code implementations • 7 Oct 2021 • Dhruv Guliani, Lillian Zhou, Changwan Ryu, Tien-Ju Yang, Harry Zhang, Yonghui Xiao, Francoise Beaufays, Giovanni Motta
Federated learning can be used to train machine learning models on the edge on local data that never leave devices, providing privacy by default.
Automatic Speech Recognition (ASR)
no code implementations • 29 May 2021 • Shivin Devgon, Jeffrey Ichnowski, Ashwin Balakrishna, Harry Zhang, Ken Goldberg
We formulate a self-supervised objective for this problem and train a deep neural network to estimate the 3D rotation, parameterized by a quaternion, between the current and desired depth images.
no code implementations • 10 Nov 2020 • Harry Zhang, Jeffrey Ichnowski, Daniel Seita, Jonathan Wang, Huang Huang, Ken Goldberg
The framework finds a 3D apex point for the robot arm, which, together with a task-specific trajectory function, defines an arcing motion that dynamically manipulates the cable to perform tasks with varying obstacle and target locations.
no code implementations • 18 Sep 2020 • Yahav Avigal, Samuel Paradis, Harry Zhang
Recent consumer demand for home robots has accelerated performance of robotic grasping.
no code implementations • 14 Dec 2019 • Khe Chai Sim, Françoise Beaufays, Arnaud Benard, Dhruv Guliani, Andreas Kabel, Nikhil Khare, Tamar Lucassen, Petr Zadrazil, Harry Zhang, Leif Johnson, Giovanni Motta, Lillian Zhou
With speech input, if the user corrects only the names, the name recall rate improves to 64.4%.