Search Results for author: Matthias Grundmann

Found 25 papers, 7 papers with code

MediaPipe: A Framework for Building Perception Pipelines

2 code implementations 14 Jun 2019 Camillo Lugaresi, Jiuqiang Tang, Hadon Nash, Chris McClanahan, Esha Uboweja, Michael Hays, Fan Zhang, Chuo-Ling Chang, Ming Guang Yong, Juhyun Lee, Wan-Teh Chang, Wei Hua, Manfred Georg, Matthias Grundmann

A developer can use MediaPipe to build prototypes by combining existing perception components, to advance them to polished cross-platform applications, and to measure system performance and resource consumption on target platforms.

Distributed, Parallel, and Cluster Computing

BlazePose: On-device Real-time Body Pose tracking

7 code implementations 17 Jun 2020 Valentin Bazarevsky, Ivan Grishchenko, Karthik Raveendran, Tyler Zhu, Fan Zhang, Matthias Grundmann

We present BlazePose, a lightweight convolutional neural network architecture for human pose estimation that is tailored for real-time inference on mobile devices.

2D Human Pose Estimation, 3D Human Pose Estimation +4

Attention Mesh: High-fidelity Face Mesh Prediction in Real-time

1 code implementation 19 Jun 2020 Ivan Grishchenko, Artsiom Ablavatski, Yury Kartynnik, Karthik Raveendran, Matthias Grundmann

We present Attention Mesh, a lightweight architecture for 3D face mesh prediction that uses attention to semantically meaningful regions.

Vocal Bursts Intensity Prediction

BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs

10 code implementations 11 Jul 2019 Valentin Bazarevsky, Yury Kartynnik, Andrey Vakunov, Karthik Raveendran, Matthias Grundmann

We present BlazeFace, a lightweight and well-performing face detector tailored for mobile GPU inference.

Face Detection

MediaPipe Hands: On-device Real-time Hand Tracking

4 code implementations 18 Jun 2020 Fan Zhang, Valentin Bazarevsky, Andrey Vakunov, Andrei Tkachenka, George Sung, Chuo-Ling Chang, Matthias Grundmann

We present a real-time on-device hand tracking pipeline that predicts a hand skeleton from a single RGB camera for AR/VR applications.

Finding Temporally Consistent Occlusion Boundaries in Videos using Geometric Context

no code implementations 25 Oct 2015 S. Hussain Raza, Ahmad Humayun, Matthias Grundmann, David Anderson, Irfan Essa

Our proposed framework provides an efficient approach for finding temporally consistent occlusion boundaries in video by utilizing causality, redundancy in videos, and semantic layout of the scene.

Geometric Context from Videos

no code implementations CVPR 2013 S. Hussain Raza, Matthias Grundmann, Irfan Essa

We present a novel algorithm for estimating the broad 3D geometric structure of outdoor video scenes.

Segmentation, Video Segmentation +1

On-Device Neural Net Inference with Mobile GPUs

no code implementations 3 Jul 2019 Juhyun Lee, Nikolay Chirkov, Ekaterina Ignasheva, Yury Pisarchyk, Mogan Shieh, Fabio Riccardi, Raman Sarokin, Andrei Kulik, Matthias Grundmann

On-device inference of machine learning models for mobile phones is desirable due to its lower latency and increased privacy.

On the Estimation of the Number of Unreachable Peers in the Bitcoin P2P Network by Observation of Peer Announcements

no code implementations 25 Feb 2021 Matthias Grundmann, Hedwig Amberg, Hannes Hartenstein

Thus, the number of unreachable peers can only be estimated based on some indicators.

Cryptography and Security, Networking and Internet Architecture

On-device Real-time Hand Gesture Recognition

no code implementations 29 Oct 2021 George Sung, Kanstantsin Sokal, Esha Uboweja, Valentin Bazarevsky, Jonathan Baccash, Eduard Gabriel Bazavan, Chuo-Ling Chang, Matthias Grundmann

We present an on-device real-time hand gesture recognition (HGR) system, which detects a set of predefined static gestures from a single RGB camera.

Hand Gesture Recognition

BlazePose GHUM Holistic: Real-time 3D Human Landmarks and Pose Estimation

no code implementations 23 Jun 2022 Ivan Grishchenko, Valentin Bazarevsky, Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, Richard Yee, Karthik Raveendran, Matsvei Zhdanovich, Matthias Grundmann, Cristian Sminchisescu

We present BlazePose GHUM Holistic, a lightweight neural network pipeline for 3D human body landmarks and pose estimation, specifically tailored to real-time on-device inference.

3D Human Pose Estimation

Guided Speech Enhancement Network

no code implementations 13 Mar 2023 Yang Yang, Shao-Fu Shih, Hakan Erdogan, Jamie Menjay Lin, Chehung Lee, Yunpeng Li, George Sung, Matthias Grundmann

The multi-microphone speech enhancement problem is often decomposed into two decoupled steps: a beamformer that provides spatial filtering and a single-channel speech enhancement model that cleans up the beamformer output.

Denoising, Speech Enhancement

Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond

no code implementations ICCV 2023 Yang Zhao, Tingbo Hou, Yu-Chuan Su, Xuhui Jia, Yandong Li, Matthias Grundmann

An authentic face restoration system is becoming increasingly demanding in many computer vision applications, e.g., image enhancement, video communication, and taking portraits.

Blind Face Restoration, Denoising +2

Blendshapes GHUM: Real-time Monocular Facial Blendshape Prediction

no code implementations 11 Sep 2023 Ivan Grishchenko, Geng Yan, Eduard Gabriel Bazavan, Andrei Zanfir, Nikolai Chinaev, Karthik Raveendran, Matthias Grundmann, Cristian Sminchisescu

We present Blendshapes GHUM, an on-device ML pipeline that predicts 52 facial blendshape coefficients at 30+ FPS on modern mobile phones from a single monocular RGB image, enabling facial motion capture applications such as virtual avatars.

On-device Real-time Custom Hand Gesture Recognition

no code implementations 19 Sep 2023 Esha Uboweja, David Tian, Qifei Wang, Yi-Chun Kuo, Joe Zou, Lu Wang, George Sung, Matthias Grundmann

Our framework provides a pre-trained single-hand embedding model that can be fine-tuned for custom gesture recognition.

Hand Gesture Recognition

StreamVC: Real-Time Low-Latency Voice Conversion

no code implementations 5 Jan 2024 Yang Yang, Yury Kartynnik, Yunpeng Li, Jiuqiang Tang, Xing Li, George Sung, Matthias Grundmann

We present StreamVC, a streaming voice conversion solution that preserves the content and prosody of any source speech while matching the voice timbre from any target speech.

Speech Synthesis, Voice Conversion

Binaural Angular Separation Network

no code implementations 16 Jan 2024 Yang Yang, George Sung, Shao-Fu Shih, Hakan Erdogan, Chehung Lee, Matthias Grundmann

We propose a neural network model that can separate target speech sources from interfering sources at different angular regions using two microphones.

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

no code implementations 13 Feb 2024 Fei Deng, Qifei Wang, Wei Wei, Matthias Grundmann, Tingbo Hou

However, in the vision domain, existing RL-based reward finetuning methods are limited by their instability in large-scale training, rendering them incapable of generalizing to complex, unseen prompts.

Denoising, Reinforcement Learning (RL)
