no code implementations • 20 Nov 2023 • Jia-Hong Huang, Chao-Han Huck Yang, Pin-Yu Chen, Min-Hung Chen, Marcel Worring
The aim of video summarization is to shorten videos automatically while retaining the key information necessary to convey the overall story.
no code implementations • 4 Jul 2023 • Jia-Hong Huang, Luka Murn, Marta Mrak, Marcel Worring
Existing manually labelled datasets for query-based video summarization are costly to create and thus small, limiting the performance of supervised deep video summarization models.
no code implementations • 4 Jul 2023 • Jia-Hong Huang, Chao-Han Huck Yang, Pin-Yu Chen, Andrew Brown, Marcel Worring
Multi-modal video summarization has a video input and a text-based query input.
no code implementations • 30 Apr 2023 • Jia-Hong Huang, Chao-Han Huck Yang, Pin-Yu Chen, Min-Hung Chen, Marcel Worring
In this work, a Causal Explainer, dubbed Causalainer, is proposed to address this issue.
no code implementations • 6 Apr 2023 • Jia-Hong Huang, Modar Alfadly, Bernard Ghanem, Marcel Worring
This work proposes a new method that utilizes semantically related questions, referred to as basic questions, acting as noise to evaluate the robustness of VQA models.
2 code implementations • 13 Oct 2021 • Riccardo Di Sipio, Jia-Hong Huang, Samuel Yen-Chi Chen, Stefano Mangini, Marcel Worring
In this paper, we discuss initial attempts at using quantum computing to boost deep-learning-based understanding of human language.
no code implementations • 30 May 2021 • Jia-Hong Huang, Ting-Wei Wu, Chao-Han Huck Yang, Marcel Worring
Automatically generating medical reports for retinal images is one of the promising ways to help ophthalmologists reduce their workload and improve work efficiency.
no code implementations • 26 Apr 2021 • Jia-Hong Huang, Ting-Wei Wu, Marcel Worring
A traditional medical image captioning model creates a medical description based only on a single medical image input.
2 code implementations • 26 Apr 2021 • Jia-Hong Huang, Luka Murn, Marta Mrak, Marcel Worring
Traditional video summarization methods generate fixed video representations regardless of user interest.
1 code implementation • 1 Nov 2020 • Jia-Hong Huang, Chao-Han Huck Yang, Fangyu Liu, Meng Tian, Yi-Chieh Liu, Ting-Wei Wu, I-Hung Lin, Kang Wang, Hiromasa Morikawa, Hernghua Chang, Jesper Tegner, Marcel Worring
To train and validate the effectiveness of our DNN-based module, we propose a large-scale retinal disease image dataset.
1 code implementation • 7 Apr 2020 • Jia-Hong Huang, Marcel Worring
In this work, we introduce a method which takes a text-based query as input and generates a video summary corresponding to it.
no code implementations • 30 Nov 2019 • Jia-Hong Huang, Modar Alfadly, Bernard Ghanem, Marcel Worring
In this work, we propose a new method that uses semantically related questions, dubbed basic questions, acting as noise to evaluate the robustness of VQA models.
1 code implementation • 11 Feb 2019 • Yi-Chieh Liu, Hao-Hsiang Yang, Chao-Han Huck Yang, Jia-Hong Huang, Meng Tian, Hiromasa Morikawa, Yi-Chang James Tsai, Jesper Tegner
Age-Related Macular Degeneration (AMD) is an asymptomatic retinal disease which may result in loss of vision.
1 code implementation • 16 Aug 2018 • C. -H. Huck Yang, Fangyu Liu, Jia-Hong Huang, Meng Tian, Hiromasa Morikawa, I-Hung Lin, Yi-Chieh Liu, Hao-Hsiang Yang, Jesper Tegner
Automatic clinical diagnosis of retinal diseases has emerged as a promising approach to facilitate discovery in areas with limited access to specialists.
1 code implementation • 17 Jun 2018 • C. -H. Huck Yang, Jia-Hong Huang, Fangyu Liu, Fang-Yi Chiu, Mengya Gao, Weifeng Lyu, I-Hung Lin M. D., Jesper Tegner
Automatic clinical diagnosis of retinal diseases has emerged as a promising approach to facilitate discovery in areas with limited access to specialists.
no code implementations • 16 Nov 2017 • Jia-Hong Huang, Cuong Duc Dao, Modar Alfadly, Bernard Ghanem
In VQA, adversarial attacks can target the image and/or the proposed main question, and yet there is a lack of proper analysis of the latter.
no code implementations • 14 Sep 2017 • Jia-Hong Huang, Cuong Duc Dao, Modar Alfadly, C. Huck Yang, Bernard Ghanem
Visual Question Answering (VQA) models should have both high robustness and accuracy.
no code implementations • 19 Mar 2017 • Jia-Hong Huang, Modar Alfadly, Bernard Ghanem
Given a natural language question about an image, the first module takes the question as input and then outputs the basic questions of the given main question.