Search Results for author: Jorma Laaksonen

Found 33 papers, 12 papers with code

Bilateral Reference for High-Resolution Dichotomous Image Segmentation

1 code implementation • 7 Jan 2024 • Peng Zheng, Dehong Gao, Deng-Ping Fan, Li Liu, Jorma Laaksonen, Wanli Ouyang, Nicu Sebe

It comprises two essential components: the localization module (LM) and the reconstruction module (RM) with our proposed bilateral reference (BiRef).

Ranked #1 on RGB Salient Object Detection on HRSOD (using extra training data)

Camouflaged Object Segmentation • Dichotomous Image Segmentation • +3

Semi-Supervised learning for Face Anti-Spoofing using Apex frame

no code implementations • 10 Sep 2023 • Usman Muhammad, Mourad Oussalah, Jorma Laaksonen

Conventional feature extraction techniques in the face anti-spoofing domain either analyze the entire video sequence or focus on a specific segment to improve model performance.

Face Anti-Spoofing

Saliency-based Video Summarization for Face Anti-spoofing

no code implementations • 23 Aug 2023 • Usman Muhammad, Mourad Oussalah, Jorma Laaksonen

Inspired by the visual saliency theory, we present a video summarization method for face anti-spoofing detection that aims to enhance the performance and efficiency of deep learning models by leveraging visual saliency.

Face Anti-Spoofing • Face Presentation Attack Detection • +1

PiTL: Cross-modal Retrieval with Weakly-supervised Vision-language Pre-training via Prompting

no code implementations • 14 Jul 2023 • Zixin Guo, Tzu-Jui Julius Wang, Selen Pehlivan, Abduljalil Radman, Jorma Laaksonen

To further reduce the amount of supervision, we propose Prompts-in-The-Loop (PiTL) that prompts knowledge from large language models (LLMs) to describe images.

Cross-Modal Retrieval • Object • +1

Deep Ensemble Learning with Frame Skipping for Face Anti-Spoofing

2 code implementations • 6 Jul 2023 • Usman Muhammad, Md Ziaul Hoque, Mourad Oussalah, Jorma Laaksonen

Face presentation attacks (PA), also known as spoofing attacks, pose a substantial threat to biometric systems that rely on facial recognition systems, such as access control systems, mobile payments, and identity verification systems.

Ensemble Learning • Face Anti-Spoofing • +1

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models

1 code implementation • 13 Jun 2023 • Omkar Thawkar, Abdelrahman Shaker, Sahal Shaji Mullappilly, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Jorma Laaksonen, Fahad Shahbaz Khan

The latest breakthroughs in large vision-language models, such as Bard and GPT-4, have showcased extraordinary abilities in performing a wide range of tasks.

Language Modelling • Large Language Model

Cross-modulated Few-shot Image Generation for Colorectal Tissue Classification

1 code implementation • 4 Apr 2023 • Amandeep Kumar, Ankan Kumar Bhunia, Sanath Narayan, Hisham Cholakkal, Rao Muhammad Anwer, Jorma Laaksonen, Fahad Shahbaz Khan

In this work, we propose a few-shot colorectal tissue image generation method for addressing the scarcity of histopathological training data for rare cancer tissues.

Data Augmentation • Image Classification • +1

Video Instance Segmentation in an Open-World

1 code implementation • 3 Apr 2023 • Omkar Thawakar, Sanath Narayan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Jorma Laaksonen, Mubarak Shah, Fahad Shahbaz Khan

Open-world formulation relaxes the closed-world static-learning assumption as follows: (a) first, it distinguishes a set of known categories as well as labels an unknown object as 'unknown' and then (b) it incrementally learns the class of an unknown as and when the corresponding semantic labels become available.

Instance Segmentation • Semantic Segmentation • +1

3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers

1 code implementation • 21 Mar 2023 • Omkar Thawakar, Rao Muhammad Anwer, Jorma Laaksonen, Orly Reiner, Mubarak Shah, Fahad Shahbaz Khan

Accurate 3D mitochondria instance segmentation in electron microscopy (EM) is a challenging problem and serves as a prerequisite to empirically analyze their distributions and morphology.

Instance Segmentation • Semantic Segmentation

Domain Generalization via Ensemble Stacking for Face Presentation Attack Detection

no code implementations • 5 Jan 2023 • Usman Muhammad, Jorma Laaksonen, Djamila Romaissa Beddiar, Mourad Oussalah

The latter combines the predictions from the base models, leveraging their complementary information to better handle unseen target domains and enhance the overall performance.

Domain Generalization • Ensemble Learning • +3

Person Image Synthesis via Denoising Diffusion Model

1 code implementation • CVPR 2023 • Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Jorma Laaksonen, Mubarak Shah, Fahad Shahbaz Khan

In this work, we show how denoising diffusion models can be applied for high-fidelity person image synthesis with strong sample diversity and enhanced mode coverage of the learnt data distribution.

Denoising • Image Generation

When to Laugh and How Hard? A Multimodal Approach to Detecting Humor and its Intensity

no code implementations • COLING 2022 • Khalid Alnajjar, Mika Hämäläinen, Jörg Tiedemann, Jorma Laaksonen, Mikko Kurimo

Our results show that the model is capable of correctly detecting whether an utterance is humorous 78% of the time and how long the audience's laughter reaction should last with a mean absolute error of 600 milliseconds.

Learning by Hallucinating: Vision-Language Pre-training with Weak Supervision

no code implementations • 24 Oct 2022 • Tzu-Jui Julius Wang, Jorma Laaksonen, Tomas Langer, Heikki Arponen, Tom E. Bishop

Moreover, in other V-L downstream tasks considered, our WFH models are on par with models trained with paired V-L data, revealing the utility of unpaired data.

Cross-Modal Retrieval • Image Retrieval • +3

CLIP4IDC: CLIP for Image Difference Captioning

1 code implementation • 1 Jun 2022 • Zixin Guo, Tzu-Jui Julius Wang, Jorma Laaksonen

Different from directly fine-tuning CLIP to generate sentences, we introduce an adaptation training process to adapt CLIP's visual encoder to capture and align differences in image pairs based on the textual descriptions.

Domain Adaptation • Image Classification

DoodleFormer: Creative Sketch Drawing with Transformers

no code implementations • 6 Dec 2021 • Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laaksonen, Michael Felsberg

Creative sketch image generation is a challenging vision problem, where the task is to generate diverse, yet realistic creative sketches possessing the unseen composition of the visual-world objects.

Image Generation

Tackling the Unannotated: Scene Graph Generation with Bias-Reduced Models

no code implementations • 18 Aug 2020 • Tzu-Jui Julius Wang, Selen Pehlivan, Jorma Laaksonen

Recent scene graph generation (SGG) models have shown their capability of capturing the most frequent relations among visual entities.

Graph Generation • Scene Graph Generation • +1

Deep Contextual Attention for Human-Object Interaction Detection

no code implementations • ICCV 2019 • Tiancai Wang, Rao Muhammad Anwer, Muhammad Haris Khan, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, Jorma Laaksonen

Our approach outperforms the state-of-the-art on all datasets.

Human-Object Interaction Detection • Object • +3

Character-Centric Storytelling

no code implementations • 17 Sep 2019 • Aditya Surikuchi, Jorma Laaksonen

Sequential vision-to-language or visual storytelling has recently been one of the areas of focus in computer vision and language modeling domains.

Language Modelling • Visual Storytelling

The MeMAD Submission to the WMT18 Multimodal Translation Task

no code implementations • WS 2018 • Stig-Arne Grönroos, Benoit Huet, Mikko Kurimo, Jorma Laaksonen, Bernard Merialdo, Phu Pham, Mats Sjöberg, Umut Sulubacak, Jörg Tiedemann, Raphael Troncy, Raúl Vázquez

Our experiments show that the effect of the visual features in our system is small.

Multimodal Machine Translation • NMT • +1

Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification

no code implementations • 5 Jun 2017 • Rao Muhammad Anwer, Fahad Shahbaz Khan, Joost Van de Weijer, Matthieu Molinier, Jorma Laaksonen

To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification.

Ranked #12 on Aerial Scene Classification on AID (20% as trainset)

Aerial Scene Classification • General Classification • +2

Saliency Revisited: Analysis of Mouse Movements versus Fixations

no code implementations • CVPR 2017 • Hamed R. Tavakoli, Fawad Ahmed, Ali Borji, Jorma Laaksonen

This paper revisits visual saliency prediction by evaluating the recent advancements in this field such as crowd-sourced mouse tracking-based databases and contextual annotations.

Model Selection • Saliency Prediction

Towards Instance Segmentation with Object Priority: Prominent Object Detection and Recognition

no code implementations • 24 Apr 2017 • Hamed R. Tavakoli, Jorma Laaksonen

The motivation behind such a problem formulation is (1) the benefits to the knowledge representation-based vision pipelines, and (2) the potential improvements in emulating bio-inspired vision systems by solving these three problems together.

Instance Segmentation • Object • +5

Paying Attention to Descriptions Generated by Image Captioning Models

2 code implementations • ICCV 2017 • Hamed R. Tavakoli, Rakshith Shetty, Ali Borji, Jorma Laaksonen

To bridge the gap between humans and machines in image understanding and describing, we need further insight into how people describe a perceived scene.

Image Captioning

Investigating Natural Image Pleasantness Recognition using Deep Features and Eye Tracking for Loosely Controlled Human-computer Interaction

no code implementations • 7 Apr 2017 • Hamed R. Tavakoli, Jorma Laaksonen, Esa Rahtu

To investigate the current status in regard to affective image tagging, we (1) introduce a new eye movement dataset using an affordable eye tracker, (2) study the use of deep neural networks for pleasantness recognition, (3) investigate the gap between deep features and eye movements.

Scale Coding Bag of Deep Features for Human Attribute and Action Recognition

no code implementations • 14 Dec 2016 • Fahad Shahbaz Khan, Joost Van de Weijer, Rao Muhammad Anwer, Andrew D. Bagdanov, Michael Felsberg, Jorma Laaksonen

Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding.

Action Recognition In Still Images • Attribute

Exploiting inter-image similarity and ensemble of extreme learners for fixation prediction using deep features

1 code implementation • 20 Oct 2016 • Hamed R. Tavakoli, Ali Borji, Jorma Laaksonen, Esa Rahtu

This paper presents a novel fixation prediction and saliency modeling framework based on inter-image similarities and ensemble of Extreme Learning Machines (ELM).

Frame- and Segment-Level Features and Candidate Pool Evaluation for Video Caption Generation

1 code implementation • 17 Aug 2016 • Rakshith Shetty, Jorma Laaksonen

We present our submission to the Microsoft Video to Language Challenge of generating short captions describing videos in the challenge dataset.

Caption Generation • Video Captioning

Video captioning with recurrent networks based on frame- and video-level features and visual content classification

2 code implementations • 9 Dec 2015 • Rakshith Shetty, Jorma Laaksonen

In this paper, we describe the system for generating textual descriptions of short video clips using recurrent neural networks (RNN), which we used while participating in the Large Scale Movie Description Challenge 2015 in ICCV 2015.

Caption Generation • General Classification • +2

Towards Reliable Automatic Multimodal Content Analysis

no code implementations • WS 2015 • Olli-Philippe Lautenbacher, Liisa Tiittula, Maija Hirvonen, Jorma Laaksonen, Mikko Kurimo

PinView: Implicit Feedback in Content-Based Image Retrieval

no code implementations • 2 Oct 2014 • Zakria Hussain, Arto Klami, Jussi Kujala, Alex P. Leung, Kitsuchart Pasupa, Peter Auer, Samuel Kaski, Jorma Laaksonen, John Shawe-Taylor

It then retrieves images with a specialized online learning algorithm that balances the tradeoff between exploring new images and exploiting the already inferred interests of the user.

Content-Based Image Retrieval • Retrieval

S-pot - a benchmark in spotting signs within continuous signing

no code implementations • LREC 2014 • Ville Viitaniemi, Tommi Jantunen, Leena Savolainen, Matti Karppa, Jorma Laaksonen

In this paper we present S-pot, a benchmark setting for evaluating the performance of automatic spotting of signs in continuous sign language videos.

SLMotion - An extensible sign language oriented video analysis tool

no code implementations • LREC 2014 • Matti Karppa, Ville Viitaniemi, Marcos Luzardo, Jorma Laaksonen, Tommi Jantunen

We present a software toolkit called SLMotion which provides a framework for automatic and semiautomatic analysis, feature extraction and annotation of individual sign language videos, and which can easily be adapted to batch processing of entire sign language corpora.

Sign Language Recognition

Comparing computer vision analysis of signed language video with motion capture recordings

no code implementations • LREC 2012 • Matti Karppa, Tommi Jantunen, Ville Viitaniemi, Jorma Laaksonen, Birgitta Burger, Danny De Weerdt

We consider a non-intrusive computer-vision method for measuring the motion of a person performing natural signing in video recordings.
