Search Results for author: Esa Rahtu

Found 76 papers, 38 papers with code

DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing

1 code implementation 26 Mar 2024 Matias Turkulainen, Xuqian Ren, Iaroslav Melekhov, Otto Seiskari, Esa Rahtu, Juho Kannala

3D Gaussian splatting, a novel differentiable rendering technique, has achieved state-of-the-art novel view synthesis results with high rendering speeds and relatively low training times.

Depth Estimation Novel View Synthesis

Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion

1 code implementation 20 Mar 2024 Otto Seiskari, Jerry Ylilammi, Valtteri Kaatrasalo, Pekka Rantalankila, Matias Turkulainen, Juho Kannala, Esa Rahtu, Arno Solin

High-quality scene reconstruction and novel view synthesis based on Gaussian Splatting (3DGS) typically require steady, high-quality photographs, often impractical to capture with handheld cameras.

Novel View Synthesis

GS-Pose: Cascaded Framework for Generalizable Segmentation-based 6D Object Pose Estimation

no code implementations 15 Mar 2024 Dingding Cai, Janne Heikkilä, Esa Rahtu

At inference, GS-Pose operates sequentially by locating the object in the input image, estimating its initial 6D pose using a retrieval approach, and refining the pose with a render-and-compare method.

6D Pose Estimation using RGB Object +1
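The GS-Pose entry above describes a locate, retrieve, and refine cascade at inference time. The minimal Python sketch below illustrates only that control flow; the segmenter, retrieval database, and render-and-compare refiner are hypothetical placeholders, not the authors' implementation.

```python
# Hypothetical sketch of a locate -> retrieve -> refine pose pipeline.
# The three stage objects are placeholders, not the GS-Pose code.

def estimate_pose(image, segmenter, retrieval_db, refiner, num_refine_steps=3):
    # Stage 1: locate the target object in the input image.
    mask, crop = segmenter(image)

    # Stage 2: initial 6D pose by retrieving the closest stored template view.
    rotation, translation = retrieval_db.query(crop)

    # Stage 3: iterative render-and-compare refinement of the initial pose.
    for _ in range(num_refine_steps):
        rendered = refiner.render(rotation, translation)
        delta_r, delta_t = refiner.compare(rendered, crop)
        rotation, translation = delta_r @ rotation, translation + delta_t
    return rotation, translation
```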

Synchformer: Efficient Synchronization from Sparse Cues

2 code implementations 29 Jan 2024 Vladimir Iashin, Weidi Xie, Esa Rahtu, Andrew Zisserman

Our objective is audio-visual synchronization with a focus on 'in-the-wild' videos, such as those on YouTube, where synchronization cues can be sparse.

Audio-Visual Synchronization

Bridging the gap between image coding for machines and humans

no code implementations 19 Jan 2024 Nam Le, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed Rezazadegan Tavakoli, Emre Aksu, Miska M. Hannuksela, Esa Rahtu

Image coding for machines (ICM) aims at reducing the bitrate required to represent an image while minimizing the drop in machine vision analysis accuracy.
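As a rough illustration of the bitrate-versus-accuracy trade-off that ICM optimizes, the sketch below combines an estimated bitrate with a downstream task loss. The codec, task network, and the weighting factor `lambda_task` are assumptions made for illustration, not the method proposed in this paper.

```python
import torch

def icm_training_loss(codec, task_network, task_loss_fn, image, target, lambda_task=1.0):
    """Generic rate + task-accuracy objective used in image coding for machines.

    `codec` is assumed to return the reconstructed image and a bitrate estimate
    (e.g. from an entropy model); both modules are placeholders.
    """
    reconstruction, bitrate = codec(image)
    task_output = task_network(reconstruction)
    # Minimise the bits spent while keeping the downstream analysis accurate.
    return bitrate + lambda_task * task_loss_fn(task_output, target)
```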

NN-VVC: Versatile Video Coding boosted by self-supervisedly learned image coding for machines

no code implementations 19 Jan 2024 Jukka I. Ahonen, Nam Le, Honglei Zhang, Antti Hallapuro, Francesco Cricri, Hamed Rezazadegan Tavakoli, Miska M. Hannuksela, Esa Rahtu

To the best of our knowledge, this is the first research paper showing a hybrid video codec that outperforms VVC on multiple datasets and multiple machine vision tasks.

MuSHRoom: Multi-Sensor Hybrid Room Dataset for Joint 3D Reconstruction and Novel View Synthesis

no code implementations 5 Nov 2023 Xuqian Ren, Wenjia Wang, Dingding Cai, Tuuli Tuominen, Juho Kannala, Esa Rahtu

Metaverse technologies demand accurate, real-time, and immersive modeling on consumer-grade hardware for both non-human perception (e.g., drone/robot/autonomous car navigation) and immersive technologies like AR/VR, requiring both structural accuracy and photorealism.

3D Reconstruction Novel View Synthesis

Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation

no code implementations CVPR 2023 Mayu Otani, Riku Togashi, Yu Sawai, Ryosuke Ishigami, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Shin'ichi Satoh

Human evaluation is critical for validating the performance of text-to-image generative models, as this highly cognitive process requires deep comprehension of text and images.

Text-to-Image Generation

FinnWoodlands Dataset

1 code implementation 3 Apr 2023 Juan Lagos, Urho Lempiö, Esa Rahtu

Besides tree trunks, we also annotated "Obstacles" objects as instances as well as the semantic stuff classes "Lake", "Ground", and "Track".

Autonomous Driving Depth Completion +3

MSDA: Monocular Self-supervised Domain Adaptation for 6D Object Pose Estimation

no code implementations 14 Feb 2023 Dingding Cai, Janne Heikkilä, Esa Rahtu

Though massive amounts of synthetic RGB images are easy to obtain, the models trained on them suffer from noticeable performance degradation due to the synthetic-to-real domain gap.

6D Pose Estimation using RGB Domain Adaptation +1

PanDepth: Joint Panoptic Segmentation and Depth Completion

1 code implementation 29 Dec 2022 Juan Lagos, Esa Rahtu

Understanding 3D environments semantically is pivotal in autonomous driving applications where multiple computer vision tasks are involved.

Autonomous Driving Depth Completion +3

Supervised Fine-tuning Evaluation for Long-term Visual Place Recognition

no code implementations 14 Nov 2022 Farid Alijani, Esa Rahtu

In this paper, we present a comprehensive study on the utility of deep convolutional neural networks with two state-of-the-art pooling layers, placed after the convolutional layers and fine-tuned end-to-end, for the visual place recognition task under challenging conditions, including seasonal and illumination variations.

Visual Place Recognition
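One widely used pooling layer of the kind evaluated above is generalized-mean (GeM) pooling, placed after the last convolutional layer to produce a global image descriptor. The PyTorch sketch below shows a generic GeM layer for illustration; it is not claimed to be the exact variant or configuration studied in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeMPooling(nn.Module):
    """Generalized-mean pooling over the spatial dimensions of a feature map."""

    def __init__(self, p: float = 3.0, eps: float = 1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.tensor(p))  # learnable pooling exponent
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width) convolutional feature map.
        x = x.clamp(min=self.eps).pow(self.p)
        x = F.adaptive_avg_pool2d(x, output_size=1).pow(1.0 / self.p)
        return x.flatten(1)  # (batch, channels) global descriptor
```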

Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors

2 code implementations 13 Oct 2022 Vladimir Iashin, Weidi Xie, Esa Rahtu, Andrew Zisserman

This contrasts with the case of synchronising videos of talking heads, where audio-visual correspondence is dense in both time and space.

Audio-Visual Synchronization

The Weighting Game: Evaluating Quality of Explainability Methods

1 code implementation 12 Aug 2022 Lassi Raatikainen, Esa Rahtu

The objective of this paper is to assess the quality of explanation heatmaps for image classification tasks.

Image Classification

Cascaded and Generalizable Neural Radiance Fields for Fast View Synthesis

no code implementations 9 Aug 2022 Phong Nguyen-Ha, Lam Huynh, Esa Rahtu, Jiri Matas, Janne Heikkila

Moreover, our method can leverage a denser set of reference images of a single scene to produce accurate novel views without relying on additional explicit representations and still maintains the high-speed rendering of the pre-trained model.

Neural Rendering Novel View Synthesis

SC6D: Symmetry-agnostic and Correspondence-free 6D Object Pose Estimation

1 code implementation 3 Aug 2022 Dingding Cai, Janne Heikkilä, Esa Rahtu

The pose estimation is decomposed into three sub-tasks: a) object 3D rotation representation learning and matching; b) estimation of the 2D location of the object center; and c) scale-invariant distance estimation (the translation along the z-axis) via classification.

6D Pose Estimation using RGB Object +2
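A minimal sketch of the three-way decomposition described in the SC6D entry above: one branch producing a rotation representation, one estimating the 2D object-centre location, and one treating the z-axis distance as a classification over bins. The backbone features, dimensions, and branch designs below are placeholders, not the actual SC6D architecture.

```python
import torch
import torch.nn as nn

class DecoupledPoseHead(nn.Module):
    """Illustrative three-branch head mirroring the decomposition above.
    All layer sizes are example values, not SC6D's."""

    def __init__(self, feat_dim=256, rot_dim=64, num_z_bins=100):
        super().__init__()
        self.rotation_branch = nn.Linear(feat_dim, rot_dim)
        self.center_branch = nn.Linear(feat_dim, 2)       # (u, v) image location of the object centre
        self.z_branch = nn.Linear(feat_dim, num_z_bins)   # scale-invariant distance bins along z

    def forward(self, features):
        rot_embedding = self.rotation_branch(features)    # matched against a rotation codebook downstream
        center_uv = self.center_branch(features)
        z_logits = self.z_branch(features)
        return rot_embedding, center_uv, z_logits
```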

Online panoptic 3D reconstruction as a Linear Assignment Problem

1 code implementation 1 Apr 2022 Leevi Raivio, Esa Rahtu

Real-time holistic scene understanding would allow machines to interpret their surroundings in a much more detailed manner than is currently possible.

3D Reconstruction Image Segmentation +3

AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval

no code implementations CVPR 2022 Riku Togashi, Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkila, Tetsuya Sakai

First, it is rank-insensitive: It ignores the rank positions of successfully localised moments in the top-$K$ ranked list by treating the list as a set.

Moment Retrieval Retrieval

OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation

1 code implementation CVPR 2022 Dingding Cai, Janne Heikkilä, Esa Rahtu

This paper proposes a universal framework, called OVE6D, for model-based 6D object pose estimation from a single depth image and a target object mask.

6D Pose Estimation using RGB Object +1

Adaptation and Attention for Neural Video Coding

no code implementations 16 Dec 2021 Nannan Zou, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed R. Tavakoli, Jani Lainema, Emre Aksu, Miska Hannuksela, Esa Rahtu

In this work, we propose an end-to-end learned video codec that introduces several architectural novelties as well as training novelties, revolving around the concepts of adaptation and attention.

Image Compression Motion Estimation

Taming Visually Guided Sound Generation

3 code implementations 17 Oct 2021 Vladimir Iashin, Esa Rahtu

In this work, we propose a single model capable of generating visually relevant, high-fidelity sounds prompted with a set of frames from open-domain videos in less time than it takes to play it on a single GPU.

Audio Generation

Towards a Real-Time Facial Analysis System

1 code implementation 21 Sep 2021 Bishwo Adhikari, Xingyang Ni, Esa Rahtu, Heikki Huttunen

Facial analysis is an active research area in computer vision, with many practical applications.

object-detection Object Detection

V-SlowFast Network for Efficient Visual Sound Separation

1 code implementation 18 Sep 2021 Lingyu Zhu, Esa Rahtu

The objective of this paper is to perform visual sound separation: i) we study visual sound separation on spectrograms of different temporal resolutions; ii) we propose a new light yet efficient three-stream framework V-SlowFast that operates on Visual frame, Slow spectrogram, and Fast spectrogram.
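The Slow and Fast spectrogram streams mentioned above amount to computing two spectrogram views of the same audio at different temporal resolutions. The snippet below shows one way to obtain such views with torchaudio; the sample rate, FFT size, and hop lengths are arbitrary example values, not the paper's configuration.

```python
import torch
import torchaudio

# Two mel-spectrogram views of the same waveform: a coarse (slow) and a fine
# (fast) temporal resolution. Parameter values are illustrative only.
slow_spec = torchaudio.transforms.MelSpectrogram(
    sample_rate=16000, n_fft=1024, hop_length=512, n_mels=80)
fast_spec = torchaudio.transforms.MelSpectrogram(
    sample_rate=16000, n_fft=1024, hop_length=128, n_mels=80)

waveform = torch.randn(1, 16000)   # one second of dummy audio
slow = slow_spec(waveform)         # fewer time frames, coarser timing
fast = fast_spec(waveform)         # more time frames, finer timing
print(slow.shape, fast.shape)
```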

Monocular Depth Estimation Primed by Salient Point Detection and Normalized Hessian Loss

no code implementations 25 Aug 2021 Lam Huynh, Matteo Pedone, Phong Nguyen, Jiri Matas, Esa Rahtu, Janne Heikkila

In addition, we introduce a normalized Hessian loss term invariant to scaling and shear along the depth direction, which is shown to substantially improve the accuracy.

Monocular Depth Estimation

Lightweight Monocular Depth with a Novel Neural Architecture Search Method

no code implementations 25 Aug 2021 Lam Huynh, Phong Nguyen, Jiri Matas, Esa Rahtu, Janne Heikkila

This paper presents a novel neural architecture search method, called LiDNAS, for generating lightweight monocular depth estimation models.

Monocular Depth Estimation Neural Architecture Search

Learned Image Coding for Machines: A Content-Adaptive Approach

no code implementations 23 Aug 2021 Nam Le, Honglei Zhang, Francesco Cricri, Ramin Ghaznavi-Youvalari, Hamed Rezazadegan Tavakoli, Esa Rahtu

One possible approach consists of adapting current human-targeted image and video coding standards to the use case of machine consumption.

Data Compression Image Compression

Image coding for machines: an end-to-end learned approach

no code implementations 23 Aug 2021 Nam Le, Honglei Zhang, Francesco Cricri, Ramin Ghaznavi-Youvalari, Esa Rahtu

Over recent years, deep learning-based computer vision systems have been applied to images at an ever-increasing pace, oftentimes representing the only type of consumption for those images.

Instance Segmentation object-detection +2

On the Importance of Encrypting Deep Features

1 code implementation 16 Aug 2021 Xingyang Ni, Heikki Huttunen, Esa Rahtu

On the other hand, it is advisable to encrypt feature vectors, especially for a machine learning model in production.

Person Re-Identification

HybVIO: Pushing the Limits of Real-time Visual-inertial Odometry

1 code implementation 22 Jun 2021 Otto Seiskari, Pekka Rantalankila, Juho Kannala, Jerry Ylilammi, Esa Rahtu, Arno Solin

We present HybVIO, a novel hybrid approach for combining filtering-based visual-inertial odometry (VIO) with optimization-based SLAM.

FlipReID: Closing the Gap between Training and Inference in Person Re-Identification

1 code implementation 12 May 2021 Xingyang Ni, Esa Rahtu

More specifically, models using the FlipReID structure are trained on the original images and the flipped images simultaneously, and incorporating the flipping loss minimizes the mean squared error between feature vectors of corresponding image pairs.

Person Re-Identification
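The flipping loss described in the FlipReID entry above can be written as a mean squared error between the embeddings of an image and its horizontal flip. The sketch below shows only that term with a generic embedding network; the paper's full training objective is not reproduced here.

```python
import torch
import torch.nn.functional as F

def flipping_loss(model, images):
    """MSE between feature vectors of original and horizontally flipped images.
    `model` is any embedding network returning (batch, dim) features;
    images are assumed to be (N, C, H, W)."""
    features = model(images)
    flipped_features = model(torch.flip(images, dims=[3]))  # flip the width axis
    return F.mse_loss(features, flipped_features)
```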

Sample selection for efficient image annotation

no code implementations 10 May 2021 Bishwo Adhikari, Esa Rahtu, Heikki Huttunen

Supervised object detection has proven successful on many benchmark datasets, achieving human-level performance.

object-detection Object Detection

Selective Probabilistic Classifier Based on Hypothesis Testing

no code implementations 9 May 2021 Saeed Bakhshi Germi, Esa Rahtu, Heikki Huttunen

In this paper, we propose a simple yet effective method to deal with the violation of the Closed-World Assumption for a classifier.

Visually Guided Sound Source Separation and Localization using Self-Supervised Motion Representations

1 code implementation 17 Apr 2021 Lingyu Zhu, Esa Rahtu

The objective of this paper is to perform audio-visual sound source separation, i.e., to separate component audios from a mixture based on the videos of sound sources.

Optical Flow Estimation Visually Guided Sound Source Separation
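As background for the entry above, a common formulation of visually guided source separation predicts, for each visible sound source, a spectrogram mask conditioned on that source's visual features and applies it to the mixture. The sketch below illustrates only that generic idea and is not the specific pipeline proposed in the paper; `mask_network` is a placeholder.

```python
import torch

def separate_sources(mixture_spectrogram, video_features, mask_network):
    """Generic mask-based separation sketch: for each sound source, a network
    conditioned on that source's video features predicts a mask in [0, 1] that
    is applied to the mixture spectrogram."""
    separated = []
    for features in video_features:                    # one entry per visible source
        mask = torch.sigmoid(mask_network(mixture_spectrogram, features))
        separated.append(mask * mixture_spectrogram)   # masked magnitude spectrogram
    return separated
```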

Single Source One Shot Reenactment using Weighted motion From Paired Feature Points

no code implementations 7 Apr 2021 Soumya Tripathy, Juho Kannala, Esa Rahtu

Image reenactment is a task where the target object in the source image imitates the motion represented in the driving image.

Face Reenactment Image Animation

RGBD-Net: Predicting color and depth images for novel views synthesis

no code implementations 29 Nov 2020 Phong Nguyen, Animesh Karnewar, Lam Huynh, Esa Rahtu, Jiri Matas, Janne Heikkila

We propose a new cascaded architecture for novel view synthesis, called RGBD-Net, which consists of two core components: a hierarchical depth regression network and a depth-aware generator network.

Novel View Synthesis regression

FACEGAN: Facial Attribute Controllable rEenactment GAN

no code implementations 9 Nov 2020 Soumya Tripathy, Juho Kannala, Esa Rahtu

However, if the identity differs, the driving facial structures leak to the output distorting the reenactment result.

Attribute Face Reenactment

Uncovering Hidden Challenges in Query-Based Video Moment Retrieval

1 code implementation 1 Sep 2020 Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä

In this paper, we present a series of experiments assessing how well the benchmark results reflect the true progress in solving the moment retrieval task.

Moment Retrieval Retrieval +2

Learning to Learn to Compress

no code implementations 31 Jul 2020 Nannan Zou, Honglei Zhang, Francesco Cricri, Hamed R. -Tavakoli, Jani Lainema, Miska Hannuksela, Emre Aksu, Esa Rahtu

In a second phase, the Model-Agnostic Meta-learning approach is adapted to the specific case of image compression, where the inner-loop performs latent tensor overfitting, and the outer loop updates both encoder and decoder neural networks based on the overfitting performance.

Image Compression Meta-Learning +1
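The two-phase idea described above (an inner loop that overfits the latent tensor to one image, and an outer loop that updates the encoder and decoder based on how well that overfitting worked) can be sketched as follows. This is a loose, first-order simplification with placeholder modules and hyper-parameters, not the training procedure of the paper.

```python
import torch

def meta_train_step(encoder, decoder, distortion_fn, image,
                    inner_steps=5, inner_lr=1e-2, outer_optimizer=None):
    """Simplified sketch: inner-loop latent overfitting, then an outer-loop
    update of encoder/decoder weights. All interfaces are illustrative."""
    latent = encoder(image).detach().requires_grad_(True)

    # Inner loop: overfit the latent tensor to this particular image.
    for _ in range(inner_steps):
        loss = distortion_fn(decoder(latent), image)
        (grad,) = torch.autograd.grad(loss, latent)
        latent = (latent - inner_lr * grad).detach().requires_grad_(True)

    # Outer loop: update encoder/decoder based on post-overfitting quality
    # (first-order approximation; gradients do not flow through the inner loop).
    outer_loss = distortion_fn(decoder(latent), image) \
        + distortion_fn(decoder(encoder(image)), image)
    outer_optimizer.zero_grad()
    outer_loss.backward()
    outer_optimizer.step()
    return outer_loss.item()
```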

Leveraging Category Information for Single-Frame Visual Sound Source Separation

3 code implementations 15 Jul 2020 Lingyu Zhu, Esa Rahtu

Furthermore, our models are able to exploit the information of the sound source category in the separation process.

Optical Flow Estimation

Visually Guided Sound Source Separation using Cascaded Opponent Filter Network

1 code implementation 4 Jun 2020 Lingyu Zhu, Esa Rahtu

A key element in COF is a novel opponent filter module that identifies and relocates residual components between sources.

Visually Guided Sound Source Separation

A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer

2 code implementations 17 May 2020 Vladimir Iashin, Esa Rahtu

We show the effectiveness of the proposed model with audio and visual modalities on the dense video captioning task, yet the module is capable of digesting any two modalities in a sequence-to-sequence task.

Dense Video Captioning Temporal Action Proposal Generation

End-to-End Learning for Video Frame Compression with Self-Attention

no code implementations 20 Apr 2020 Nannan Zou, Honglei Zhang, Francesco Cricri, Hamed R. -Tavakoli, Jani Lainema, Emre Aksu, Miska Hannuksela, Esa Rahtu

One of the core components of conventional (i.e., non-learned) video codecs consists of predicting a frame from a previously-decoded frame, by leveraging temporal correlations.

MS-SSIM Optical Flow Estimation +1
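For context on the temporal prediction mentioned above, inter-frame prediction can be approximated by warping the previously decoded frame with a dense motion field. The sketch below shows generic bilinear warping with an optical-flow field; it is illustrative background, not the learned codec proposed in the paper.

```python
import torch
import torch.nn.functional as F

def warp_previous_frame(prev_frame, flow):
    """Predict the current frame by warping the previously decoded frame with a
    dense motion field. prev_frame: (N, C, H, W), flow: (N, 2, H, W) in pixels."""
    n, _, h, w = prev_frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0).to(prev_frame)  # (1, 2, H, W)
    coords = grid + flow
    # Normalise sampling coordinates to [-1, 1] for grid_sample.
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)  # (N, H, W, 2)
    return F.grid_sample(prev_frame, sample_grid, align_corners=True)
```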

Sequential View Synthesis with Transformer

no code implementations 9 Apr 2020 Phong Nguyen-Ha, Lam Huynh, Esa Rahtu, Janne Heikkila

This paper addresses the problem of novel view synthesis by means of neural rendering, where we are interested in predicting the novel view at an arbitrary camera pose based on a given set of input images from other viewpoints.

Neural Rendering Novel View Synthesis

Guiding Monocular Depth Estimation Using Depth-Attention Volume

2 code implementations ECCV 2020 Lam Huynh, Phong Nguyen-Ha, Jiri Matas, Esa Rahtu, Janne Heikkila

Recovering the scene depth from a single image is an ill-posed problem that requires additional priors, often referred to as monocular depth cues, to disambiguate different 3D interpretations.

Monocular Depth Estimation

Multi-modal Dense Video Captioning

4 code implementations 17 Mar 2020 Vladimir Iashin, Esa Rahtu

We apply automatic speech recognition (ASR) system to obtain a temporally aligned textual description of the speech (similar to subtitles) and treat it as a separate input alongside video frames and the corresponding audio track.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

DAVE: A Deep Audio-Visual Embedding for Dynamic Saliency Prediction

2 code implementations 25 May 2019 Hamed R. -Tavakoli, Ali Borji, Esa Rahtu, Juho Kannala

Our results suggest that (1) audio is a strong contributing cue for saliency prediction, (2) a salient visible sound source is the natural cause of the superiority of our Audio-Visual model, (3) richer feature representations for the input space lead to more powerful predictions even in the absence of more sophisticated saliency decoders, and (4) the Audio-Visual model improves on 53.54% of the frames predicted by the best Visual model (our baseline).

Saliency Prediction Video Saliency Prediction
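Finding (4) above reduces to counting the frames on which the audio-visual model scores better than the visual-only baseline. The small example below uses made-up per-frame scores purely to illustrate the computation.

```python
import numpy as np

# Hypothetical per-frame saliency scores (higher is better) for two models.
audio_visual_scores = np.array([0.61, 0.55, 0.72, 0.48, 0.69])
visual_only_scores  = np.array([0.58, 0.57, 0.65, 0.47, 0.66])

improved_fraction = np.mean(audio_visual_scores > visual_only_scores)
print(f"Audio-Visual model improves on {100 * improved_fraction:.2f}% of frames")
```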

Digging Deeper into Egocentric Gaze Prediction

no code implementations 12 Apr 2019 Hamed R. -Tavakoli, Esa Rahtu, Juho Kannala, Ali Borji

Extensive experiments over multiple datasets reveal that (1) spatial biases are strong in egocentric videos, (2) bottom-up saliency models perform poorly in predicting gaze and underperform spatial biases, (3) deep features perform better compared to traditional features, (4) as opposed to hand regions, the manipulation point is a strong influential cue for gaze prediction, (5) combining the proposed recurrent model with bottom-up cues, vanishing points and, in particular, manipulation point results in the best gaze prediction accuracy over egocentric videos, (6) the knowledge transfer works best for cases where the tasks or sequences are similar, and (7) task and activity recognition can benefit from gaze prediction.

Activity Recognition Gaze Prediction +2

Predicting Novel Views Using Generative Adversarial Query Network

no code implementations 10 Apr 2019 Phong Nguyen-Ha, Lam Huynh, Esa Rahtu, Janne Heikkila

The problem of predicting a novel view of the scene using an arbitrary number of observations is a challenging problem for computers as well as for humans.

Novel View Synthesis

ICface: Interpretable and Controllable Face Reenactment Using GANs

1 code implementation 3 Apr 2019 Soumya Tripathy, Juho Kannala, Esa Rahtu

This paper presents a generic face animator that is able to control the pose and expressions of a given face image.

Face Reenactment Video Editing

Rethinking the Evaluation of Video Summaries

2 code implementations CVPR 2019 Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä

Video summarization is a technique to create a short skim of the original video while preserving the main stories/content.

Video Segmentation Video Semantic Segmentation +1

ADVIO: An authentic dataset for visual-inertial odometry

1 code implementation ECCV 2018 Santiago Cortés, Arno Solin, Esa Rahtu, Juho Kannala

The lack of realistic and open benchmarking datasets for pedestrian visual-inertial odometry has made it hard to pinpoint differences in published methods.

Benchmarking

Learning image-to-image translation using paired and unpaired training samples

1 code implementation 8 May 2018 Soumya Tripathy, Juho Kannala, Esa Rahtu

In this paper, we propose a new general purpose image-to-image translation model that is able to utilize both paired and unpaired training data simultaneously.

Image-to-Image Translation Translation

Image Patch Matching Using Convolutional Descriptors with Euclidean Distance

no code implementations 31 Oct 2017 Iaroslav Melekhov, Juho Kannala, Esa Rahtu

In this work we propose a neural network based image descriptor suitable for image patch matching, which is an important task in many computer vision applications.

object-detection Object Detection +1

Summarization of User-Generated Sports Video by Using Deep Action Recognition Features

no code implementations 25 Sep 2017 Antonio Tejero-de-Pablos, Yuta Nakashima, Tomokazu Sato, Naokazu Yokoya, Marko Linna, Esa Rahtu

The labels are provided by annotators possessing different experience with respect to Kendo to demonstrate how the proposed method adapts to different needs.

Action Recognition Temporal Action Localization +1

Investigating Natural Image Pleasantness Recognition using Deep Features and Eye Tracking for Loosely Controlled Human-computer Interaction

no code implementations 7 Apr 2017 Hamed R. -Tavakoli, Jorma Laaksonen, Esa Rahtu

To investigate the current status in regard to affective image tagging, we (1) introduce a new eye movement dataset using an affordable eye tracker, (2) study the use of deep neural networks for pleasantness recognition, (3) investigate the gap between deep features and eye movements.

Image-based Localization using Hourglass Networks

no code implementations 23 Mar 2017 Iaroslav Melekhov, Juha Ylioinas, Juho Kannala, Esa Rahtu

In this paper, we propose an encoder-decoder convolutional neural network (CNN) architecture for estimating camera pose (orientation and location) from a single RGB-image.

General Classification Image-Based Localization +1

Inertial Odometry on Handheld Smartphones

1 code implementation 1 Mar 2017 Arno Solin, Santiago Cortes, Esa Rahtu, Juho Kannala

Building a complete inertial navigation system using the limited-quality data provided by current smartphones has been regarded as challenging, if not impossible.

Relative Camera Pose Estimation Using Convolutional Neural Networks

1 code implementation 5 Feb 2017 Iaroslav Melekhov, Juha Ylioinas, Juho Kannala, Esa Rahtu

This paper presents a convolutional neural network based approach for estimating the relative pose between two cameras.

General Classification Pose Estimation +2

A novel method for automatic localization of joint area on knee plain radiographs

no code implementations 31 Jan 2017 Aleksei Tiulpin, Jérôme Thevenot, Esa Rahtu, Simo Saarakkala

The results obtained on the datasets used show a mean intersection over union of 0.84, 0.79, and 0.78.
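The numbers above are mean intersection-over-union scores between detected and annotated joint regions. For reference, a standard axis-aligned bounding-box IoU can be computed as below; the paper's exact region definition may differ.

```python
def bounding_box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as
    (x_min, y_min, x_max, y_max)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    intersection = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```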

Exploiting inter-image similarity and ensemble of extreme learners for fixation prediction using deep features

1 code implementation 20 Oct 2016 Hamed R. -Tavakoli, Ali Borji, Jorma Laaksonen, Esa Rahtu

This paper presents a novel fixation prediction and saliency modeling framework based on inter-image similarities and ensemble of Extreme Learning Machines (ELM).

Video Summarization using Deep Semantic Features

2 code implementations 28 Sep 2016 Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya

For this, we design a deep neural network that maps videos as well as descriptions to a common semantic space and jointly train it with associated pairs of videos and descriptions.

Clustering Video Summarization

Real-time Human Pose Estimation from Video with Convolutional Neural Networks

no code implementations 23 Sep 2016 Marko Linna, Juho Kannala, Esa Rahtu

In this paper, we present a method for real-time multi-person human pose estimation from video by utilizing convolutional neural networks.

Action Recognition Pose Estimation +1

Learning Joint Representations of Videos and Sentences with Web Image Search

no code implementations 8 Aug 2016 Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya

In description generation, the performance level is comparable to the current state-of-the-art, although our embeddings were trained for the retrieval tasks.

Image Retrieval Natural Language Queries +5

Generating Object Segmentation Proposals using Global and Local Search

no code implementations CVPR 2014 Pekka Rantalankila, Juho Kannala, Esa Rahtu

The parameters of the graph cut problems are learnt in such a manner that they provide complementary sets of regions.

Object object-detection +3

Understanding Objects in Detail with Fine-Grained Attributes

no code implementations CVPR 2014 Andrea Vedaldi, Siddharth Mahendran, Stavros Tsogkas, Subhransu Maji, Ross Girshick, Juho Kannala, Esa Rahtu, Iasonas Kokkinos, Matthew B. Blaschko, David Weiss, Ben Taskar, Karen Simonyan, Naomi Saphra, Sammy Mohamed

We show that the collected data can be used to study the relation between part detection and attribute prediction by diagnosing the performance of classifiers that pool information from different parts of an object.

Attribute Object +2

Fine-Grained Visual Classification of Aircraft

no code implementations 21 Jun 2013 Subhransu Maji, Esa Rahtu, Juho Kannala, Matthew Blaschko, Andrea Vedaldi

This paper introduces FGVC-Aircraft, a new dataset containing 10,000 images of aircraft spanning 100 aircraft models, organised in a three-level hierarchy.

Classification Fine-Grained Image Classification +1
