Search Results for author: Jorma Laaksonen

Found 20 papers, 6 papers with code

CLIP4IDC: CLIP for Image Difference Captioning

1 code implementation • 1 Jun 2022 • Zixin Guo, Tzu-Jui Julius Wang, Jorma Laaksonen

The conventional approaches learn captioning models on the offline-extracted visual features, and the learning cannot be propagated back to the fixed feature extractors pre-trained on image classification datasets.

Domain Adaptation, Image Classification

DoodleFormer: Creative Sketch Drawing with Transformers

1 code implementation • 6 Dec 2021 • Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laaksonen, Michael Felsberg

Creative sketch image generation is a challenging vision problem, where the task is to generate diverse, yet realistic creative sketches possessing the unseen composition of the visual-world objects.

Image Generation

Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models

no code implementations • 18 Aug 2020 • Tzu-Jui Julius Wang, Selen Pehlivan, Jorma Laaksonen

Recent scene graph generation (SGG) models have shown their capability of capturing the most frequent relations among visual entities.

Graph Generation, Scene Graph Generation

Character-Centric Storytelling

no code implementations • 17 Sep 2019 • Aditya Surikuchi, Jorma Laaksonen

Sequential vision-to-language or visual storytelling has recently been one of the areas of focus in computer vision and language modeling domains.

Language Modelling, Visual Storytelling

Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification

no code implementations • 5 Jun 2017 • Rao Muhammad Anwer, Fahad Shahbaz Khan, Joost Van de Weijer, Matthieu Molinier, Jorma Laaksonen

To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification.

Aerial Scene Classification, General Classification, +2

Saliency Revisited: Analysis of Mouse Movements versus Fixations

no code implementations • CVPR 2017 • Hamed R.-Tavakoli, Fawad Ahmed, Ali Borji, Jorma Laaksonen

This paper revisits visual saliency prediction by evaluating the recent advancements in this field such as crowd-sourced mouse tracking-based databases and contextual annotations.

Model Selection, Saliency Prediction

Towards Instance Segmentation with Object Priority: Prominent Object Detection and Recognition

no code implementations • 24 Apr 2017 • Hamed R.-Tavakoli, Jorma Laaksonen

The motivation behind such a problem formulation is (1) the benefits to the knowledge representation-based vision pipelines, and (2) the potential improvements in emulating bio-inspired vision systems by solving these three problems together.

Instance Segmentation, object-detection, +4

Paying Attention to Descriptions Generated by Image Captioning Models

2 code implementations • ICCV 2017 • Hamed R.-Tavakoli, Rakshith Shetty, Ali Borji, Jorma Laaksonen

To bridge the gap between humans and machines in image understanding and describing, we need further insight into how people describe a perceived scene.

Image Captioning

Investigating Natural Image Pleasantness Recognition using Deep Features and Eye Tracking for Loosely Controlled Human-computer Interaction

no code implementations • 7 Apr 2017 • Hamed R.-Tavakoli, Jorma Laaksonen, Esa Rahtu

To investigate the current status in regard to affective image tagging, we (1) introduce a new eye movement dataset using an affordable eye tracker, (2) study the use of deep neural networks for pleasantness recognition, and (3) investigate the gap between deep features and eye movements.

Scale Coding Bag of Deep Features for Human Attribute and Action Recognition

no code implementations • 14 Dec 2016 • Fahad Shahbaz Khan, Joost Van de Weijer, Rao Muhammad Anwer, Andrew D. Bagdanov, Michael Felsberg, Jorma Laaksonen

Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding.

Action Recognition In Still Images

Exploiting inter-image similarity and ensemble of extreme learners for fixation prediction using deep features

1 code implementation • 20 Oct 2016 • Hamed R.-Tavakoli, Ali Borji, Jorma Laaksonen, Esa Rahtu

This paper presents a novel fixation prediction and saliency modeling framework based on inter-image similarities and ensemble of Extreme Learning Machines (ELM).

Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation

1 code implementation • 17 Aug 2016 • Rakshith Shetty, Jorma Laaksonen

We present our submission to the Microsoft Video to Language Challenge of generating short captions describing videos in the challenge dataset.

Video Captioning

Video captioning with recurrent networks based on frame- and video-level features and visual content classification

2 code implementations • 9 Dec 2015 • Rakshith Shetty, Jorma Laaksonen

In this paper, we describe the system for generating textual descriptions of short video clips using recurrent neural networks (RNN), which we used while participating in the Large Scale Movie Description Challenge 2015 in ICCV 2015.

General Classification, Image Captioning, +1

PinView: Implicit Feedback in Content-Based Image Retrieval

no code implementations • 2 Oct 2014 • Zakria Hussain, Arto Klami, Jussi Kujala, Alex P. Leung, Kitsuchart Pasupa, Peter Auer, Samuel Kaski, Jorma Laaksonen, John Shawe-Taylor

It then retrieves images with a specialized online learning algorithm that balances the tradeoff between exploring new images and exploiting the already inferred interests of the user.

Content-Based Image Retrieval, online learning

S-pot - a benchmark in spotting signs within continuous signing

no code implementations • LREC 2014 • Ville Viitaniemi, Tommi Jantunen, Leena Savolainen, Matti Karppa, Jorma Laaksonen

In this paper we present S-pot, a benchmark setting for evaluating the performance of automatic spotting of signs in continuous sign language videos.

SLMotion - An extensible sign language oriented video analysis tool

no code implementations • LREC 2014 • Matti Karppa, Ville Viitaniemi, Marcos Luzardo, Jorma Laaksonen, Tommi Jantunen

We present a software toolkit called SLMotion which provides a framework for automatic and semiautomatic analysis, feature extraction and annotation of individual sign language videos, and which can easily be adapted to batch processing of entire sign language corpora.

Sign Language Recognition

Comparing computer vision analysis of signed language video with motion capture recordings

no code implementations • LREC 2012 • Matti Karppa, Tommi Jantunen, Ville Viitaniemi, Jorma Laaksonen, Birgitta Burger, Danny De Weerdt

We consider a non-intrusive computer-vision method for measuring the motion of a person performing natural signing in video recordings.
