1 code implementation • LREC 2022 • Haruya Suzuki, Yuto Miyauchi, Kazuki Akiyama, Tomoyuki Kajiwara, Takashi Ninomiya, Noriko Takemura, Yuta Nakashima, Hajime Nagahara
We annotate 35,000 SNS posts with both the writer’s subjective sentiment polarity labels and the reader’s objective ones to construct a Japanese sentiment analysis dataset.
1 code implementation • EMNLP (WNUT) 2020 • Sora Ohashi, Tomoyuki Kajiwara, Chenhui Chu, Noriko Takemura, Yuta Nakashima, Hajime Nagahara
We introduce the IDSOU submission for the WNUT-2020 task 2: identification of informative COVID-19 English Tweets.
1 code implementation • 9 Dec 2024 • Patrick Ramos, Nicolas Gonthier, Selina Khan, Yuta Nakashima, Noa Garcia
Object detection in art is a valuable tool for the digital humanities, as it allows objects in artistic and historical images to be identified far faster than by manual inspection.
no code implementations • 5 Dec 2024 • Jiahao Zhang, Ryota Yoshihashi, Shunsuke Kitada, Atsuki Osanai, Yuta Nakashima
To answer this, we propose Visual-Aware Self-Correction LAyout GeneRation (VASCAR) for LVLM-based content-aware layout generation.
no code implementations • 14 Oct 2024 • Zhouqiang Jiang, Bowen Wang, JunHao Chen, Yuta Nakashima
Recent approaches for visually-rich document understanding (VrDU) use manually annotated semantic groups, where a semantic group encompasses all semantically relevant but not obviously grouped words.
1 code implementation • 20 Aug 2024 • JunHao Chen, Bowen Wang, Zhouqiang Jiang, Yuta Nakashima
By enhancing the intelligibility of human questions for black-box LLMs, our question rewriter improves the quality of generated answers.
no code implementations • 19 Aug 2024 • Yusuke Hirota, Min-Hung Chen, Chien-Yi Wang, Yuta Nakashima, Yu-Chiang Frank Wang, Ryo Hachiuma
To mitigate societal bias in CLIP and overcome these limitations simultaneously, we introduce a simple yet effective debiasing method called SANER (societal attribute neutralizer), which removes attribute information from the CLIP text features of attribute-neutral descriptions only.
1 code implementation • 4 Aug 2024 • Bowen Wang, Jiuyang Chang, Yiming Qian, Guoxin Chen, JunHao Chen, Zhouqiang Jiang, Jiahao Zhang, Yuta Nakashima, Hajime Nagahara
Large language models (LLMs) have recently showcased remarkable capabilities, spanning a wide range of tasks and applications, including those in the medical domain.
no code implementations • 8 Jul 2024 • Bowen Wang, Liangzhi Li, Jiahao Zhang, Yuta Nakashima, Hajime Nagahara
A novel loss function is designed specifically for ESCOUTER to fine-tune the model's behavior, enabling it to toggle between positive and negative explanations.
no code implementations • 4 Jul 2024 • Yusuke Hirota, Jerone T. A. Andrews, Dora Zhao, Orestis Papakyriakopoulos, Apostolos Modas, Yuta Nakashima, Alice Xiang
We tackle societal bias in image-text datasets by removing spurious correlations between protected groups and image attributes.
no code implementations • 20 Jun 2024 • Yusuke Hirota, Ryo Hachiuma, Chao-Han Huck Yang, Yuta Nakashima
Large language models (LLMs) have enhanced the capacity of vision-language models to caption visual text.
1 code implementation • 14 Jun 2024 • Wanqing Zhao, Yuta Nakashima, Haiyuan Chen, Noboru Babaguchi
Our method consistently outperforms state-of-the-art methods on all benchmark datasets and demonstrates strong generalization for fake news detection in social media.
no code implementations • CVPR 2024 • Tianwei Chen, Yusuke Hirota, Mayu Otani, Noa Garcia, Yuta Nakashima
We investigate the impact of deep generative models on potential social biases in upcoming computer vision models.
no code implementations • 5 Dec 2023 • Yankun Wu, Yuta Nakashima, Noa Garcia
Several studies have raised awareness about social biases in image generative models, demonstrating their predisposition towards stereotypes and imbalances.
1 code implementation • 7 Nov 2023 • Jiahao Zhang, Bowen Wang, Liangzhi Li, Yuta Nakashima, Hajime Nagahara
Our findings suggest that InMeMo offers a versatile and efficient way to enhance the performance of visual ICL with lightweight training.
1 code implementation • 27 Sep 2023 • Bowen Wang, Jiaxing Zhang, Ran Zhang, Yunqin Li, Liangzhi Li, Yuta Nakashima
We introduce a new pipeline, Revision-based Transformer Facade Parsing (RTFP).
1 code implementation • CVPR 2023 • Bowen Wang, Liangzhi Li, Yuta Nakashima, Hajime Nagahara
Using some image classification tasks as our testbed, we demonstrate BotCL's potential to rebuild neural networks for better interpretability.
1 code implementation • 20 Apr 2023 • Yankun Wu, Yuta Nakashima, Noa Garcia
The duality of content and style is inherent to the nature of art.
1 code implementation • CVPR 2023 • Yusuke Hirota, Yuta Nakashima, Noa Garcia
From this observation, we hypothesize that there are two types of gender bias affecting image captioning models: 1) bias that exploits context to predict gender, and 2) bias in the probability of generating certain (often stereotypical) words because of gender.
1 code implementation • CVPR 2023 • Noa Garcia, Yusuke Hirota, Yankun Wu, Yuta Nakashima
The increasing tendency to collect large and uncurated datasets to train vision-and-language models has raised concerns about fair representations.
no code implementations • CVPR 2023 • Mayu Otani, Riku Togashi, Yu Sawai, Ryosuke Ishigami, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Shin'ichi Satoh
Human evaluation is critical for validating the performance of text-to-image generative models, as this highly cognitive process requires deep comprehension of text and images.
no code implementations • 31 Jan 2023 • Hugo Lemarchant, Liangzi Li, Yiming Qian, Yuta Nakashima, Hajime Nagahara
Vision Transformers (ViTs) are becoming a very popular paradigm for vision tasks as they achieve state-of-the-art performance on image classification.
1 code implementation • 18 Nov 2022 • Zongshang Pang, Yuta Nakashima, Mayu Otani, Hajime Nagahara
Video summarization aims to select the most informative subset of frames in a video to facilitate efficient video browsing.
no code implementations • 23 Aug 2022 • Tianwei Chen, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Hajime Nagahara
Is more data always better to train vision-and-language models?
no code implementations • 17 May 2022 • Yusuke Hirota, Yuta Nakashima, Noa Garcia
Our findings suggest that there are dangers associated with using VQA datasets without considering and dealing with their potentially harmful stereotypes.
no code implementations • CVPR 2022 • Riku Togashi, Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkila, Tetsuya Sakai
First, it is rank-insensitive: It ignores the rank positions of successfully localised moments in the top-$K$ ranked list by treating the list as a set.
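As an illustration of this rank-insensitivity (the metric names and moments below are made-up examples, not the paper's proposed measure), a set-based Recall@K scores two rankings identically even when the correct moment sits at opposite ends of the top-$K$ list, whereas a rank-aware score such as average precision distinguishes them:

```python
def recall_at_k(ranked, relevant, k):
    """Set-based: 1.0 if any relevant moment appears anywhere in the top-k."""
    return float(any(m in relevant for m in ranked[:k]))

def average_precision_at_k(ranked, relevant, k):
    """Rank-aware alternative: rewards placing relevant moments earlier."""
    hits, score = 0, 0.0
    for rank, m in enumerate(ranked[:k], start=1):
        if m in relevant:
            hits += 1
            score += hits / rank
    return score / max(1, min(len(relevant), k))

relevant = {"moment_A"}
print(recall_at_k(["moment_A", "x", "y"], relevant, 3))             # 1.0
print(recall_at_k(["x", "y", "moment_A"], relevant, 3))             # 1.0 -- same score, worse ranking
print(average_precision_at_k(["moment_A", "x", "y"], relevant, 3))  # 1.0
print(average_precision_at_k(["x", "y", "moment_A"], relevant, 3))  # ~0.33
```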
1 code implementation • CVPR 2022 • Yusuke Hirota, Yuta Nakashima, Noa Garcia
We study societal bias amplification in image captioning.
1 code implementation • CVPR 2022 • Mayu Otani, Riku Togashi, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Shin'ichi Satoh
OC-cost computes the cost of correcting detections to ground truths as a measure of accuracy.
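As a rough sketch of the underlying idea only (the paper formulates OC-cost via optimal transport; the Hungarian matching, the equal weighting of localisation and classification errors, and the fixed penalty for unmatched boxes below are our simplifying assumptions, not the exact formulation):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)

def correction_cost(detections, ground_truths, unmatched_cost=1.0):
    """Average cost of 'correcting' detections into ground truths.
    Each item is a dict with 'box' and 'label'. Simplified illustration,
    not the exact OC-cost formulation."""
    if not detections or not ground_truths:
        return 0.0 if not detections and not ground_truths else unmatched_cost
    cost = np.zeros((len(detections), len(ground_truths)))
    for i, det in enumerate(detections):
        for j, gt in enumerate(ground_truths):
            loc = 1.0 - iou(det["box"], gt["box"])             # localisation error
            cls = 0.0 if det["label"] == gt["label"] else 1.0  # classification error
            cost[i, j] = 0.5 * loc + 0.5 * cls
    rows, cols = linear_sum_assignment(cost)                   # one-to-one matching
    unmatched = abs(len(detections) - len(ground_truths))      # leftover boxes
    total = cost[rows, cols].sum() + unmatched_cost * unmatched
    return total / max(len(detections), len(ground_truths))
```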
no code implementations • 26 Oct 2021 • Tianran Wu, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Haruo Takemura
Video question answering (VideoQA) is designed to answer a given question based on a relevant video clip.
1 code implementation • ICCV 2021 • Zechen Bai, Yuta Nakashima, Noa Garcia
Have you ever looked at a painting and wondered what is the story behind it?
no code implementations • 2 Sep 2021 • Yiming Qian, Cheikh Brahim El Vaigh, Yuta Nakashima, Benjamin Renoust, Hajime Nagahara, Yutaka Fujioka
Buddha statues are a part of human culture, especially in Asia, and they have accompanied human civilisation for more than 2,000 years.
no code implementations • ACL 2021 • Jules Samaran, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima
The impressive performances of pre-trained visually grounded language models have motivated a growing body of research investigating what has been learned during the pre-training.
no code implementations • 7 Jul 2021 • Akihiko Sayo, Diego Thomas, Hiroshi Kawasaki, Yuta Nakashima, Katsushi Ikeuchi
We propose a new 2D pose refinement network that learns to predict the human bias in the estimated 2D pose.
Ranked #75 on 3D Human Pose Estimation on Human3.6M (related task: Multi-view 3D Human Pose Estimation).
no code implementations • 25 Jun 2021 • Yusuke Hirota, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Ittetsu Taniguchi, Takao Onoye
This paper delves into the effectiveness of textual representations for image understanding in the specific context of VQA.
1 code implementation • NAACL 2021 • Tomoyuki Kajiwara, Chenhui Chu, Noriko Takemura, Yuta Nakashima, Hajime Nagahara
We annotate 17,000 SNS posts with both the writer's subjective emotional intensity and the reader's objective one to construct a Japanese emotion analysis dataset.
no code implementations • 25 May 2021 • Cheikh Brahim El Vaigh, Noa Garcia, Benjamin Renoust, Chenhui Chu, Yuta Nakashima, Hajime Nagahara
In this paper, we propose a novel use of a knowledge graph, that is constructed on annotated data and pseudo-labeled data.
no code implementations • 28 Jan 2021 • Kiichi Goto, Taikan Suehara, Tamaki Yoshioka, Masakazu Kurata, Hajime Nagahara, Yuta Nakashima, Noriko Takemura, Masako Iwasaki
Deep learning is a rapidly evolving technology with the potential to significantly improve the physics reach of collider experiments.
no code implementations • 14 Jan 2021 • Vinay Damodaran, Sharanya Chakravarthy, Akshay Kumar, Anjana Umapathy, Teruko Mitamura, Yuta Nakashima, Noa Garcia, Chenhui Chu
Visual Question Answering (VQA) is of tremendous interest to the research community with important applications such as aiding visually impaired users and image-based search.
1 code implementation • 25 Nov 2020 • Bowen Wang, Liangzhi Li, Manisha Verma, Yuta Nakashima, Ryo Kawasaki, Hajime Nagahara
Few-shot learning (FSL) approaches are usually based on an assumption that the pre-trained knowledge can be obtained from base (seen) categories and can be well transferred to novel (unseen) categories.
Ranked #36 on Few-Shot Image Classification on CIFAR-FS 5-way (5-shot).
1 code implementation • 7 Nov 2020 • Liangzhi Li, Manisha Verma, Bowen Wang, Yuta Nakashima, Hajime Nagahara, Ryo Kawasaki
Our severity grading method was able to validate crossing points with precision and recall of 96.3% and 96.3%, respectively.
no code implementations • 19 Oct 2020 • Bowen Wang, Liangzhi Li, Yuta Nakashima, Ryo Kawasaki, Hajime Nagahara, Yasushi Yagi
Semantic video segmentation is a key challenge for various applications.
no code implementations • 11 Oct 2020 • Chenhui Chu, Yuto Takebayashi, Mishra Vipul, Yuta Nakashima
Visual relationship detection is crucial for scene understanding in images.
no code implementations • 30 Sep 2020 • Nikolai Huckle, Noa Garcia, Yuta Nakashima
Art produced today, on the other hand, is abundant and easily accessible through the internet and social networks, which are used by professional and amateur artists alike to display their work.
1 code implementation • ICCV 2021 • Liangzhi Li, Bowen Wang, Manisha Verma, Yuta Nakashima, Ryo Kawasaki, Hajime Nagahara
Explainable artificial intelligence has been gaining attention in the past few years.
1 code implementation • 1 Sep 2020 • Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä
In this paper, we present a series of experiments assessing how well the benchmark results reflect the true progress in solving the moment retrieval task.
1 code implementation • 28 Aug 2020 • Noa Garcia, Chentao Ye, Zihua Liu, Qingtao Hu, Mayu Otani, Chenhui Chu, Yuta Nakashima, Teruko Mitamura
Our dataset inherently consists of visual (painting-based) and knowledge (comment-based) questions.
no code implementations • 22 Jul 2020 • Sudhakar Kumawat, Manisha Verma, Yuta Nakashima, Shanmuganathan Raman
To address these issues, we propose spatio-temporal short term Fourier transform (STFT) blocks, a new class of convolutional blocks that can serve as an alternative to the 3D convolutional layer and its variants in 3D CNNs.
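A very rough sketch of the general idea only, not the paper's STFT block: swap a learned 3D convolution for a fixed bank of local Fourier-basis filters applied depthwise, followed by a learned pointwise (1x1x1) convolution. The frequency choices and channel layout here are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class STFTBlockSketch(nn.Module):
    """Simplified stand-in for a spatio-temporal STFT block: fixed (non-trainable)
    local Fourier-basis filters followed by a learned 1x1x1 convolution."""
    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        k = kernel_size
        # Local (t, h, w) coordinates of the window and a few low frequencies.
        coords = torch.stack(torch.meshgrid(
            torch.arange(k), torch.arange(k), torch.arange(k), indexing="ij"),
            dim=-1).float()                                    # (k, k, k, 3)
        freqs = torch.tensor([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.], [1., 1., 1.]])
        phase = 2 * torch.pi * (coords @ freqs.t()) / k        # (k, k, k, 4)
        basis = torch.cat([torch.cos(phase), torch.sin(phase)], dim=-1)  # real + imag
        filters = basis.permute(3, 0, 1, 2).unsqueeze(1)       # (8, 1, k, k, k)
        self.register_buffer("filters", filters)
        self.n_basis = filters.shape[0]
        self.k = k
        self.pointwise = nn.Conv3d(in_channels * self.n_basis, out_channels, kernel_size=1)

    def forward(self, x):                                      # x: (B, C, T, H, W)
        C = x.shape[1]
        w = self.filters.repeat(C, 1, 1, 1, 1)                 # depthwise Fourier filters
        y = F.conv3d(x, w, padding=self.k // 2, groups=C)      # (B, C * n_basis, T, H, W)
        return self.pointwise(y)                               # learned channel mixing
```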
1 code implementation • ECCV 2020 • Noa Garcia, Yuta Nakashima
To understand movies, humans constantly reason over the dialogues and actions shown in specific scenes and relate them to the overall storyline already seen.
1 code implementation • MIDL 2019 • Liangzhi Li, Manisha Verma, Yuta Nakashima, Ryo Kawasaki, Hajime Nagahara
Retinal imaging serves as a valuable tool for diagnosis of various diseases.
no code implementations • LREC 2020 • Koji Tanaka, Chenhui Chu, Haolin Ren, Benjamin Renoust, Yuta Nakashima, Noriko Takemura, Hajime Nagahara, Takao Fujikawa
In this paper, we propose a full analysis pipeline, from construction to visual exploration, for a large corpus covering a century of public meetings in historical Australian newspapers.
1 code implementation • 22 Apr 2020 • Manisha Verma, Sudhakar Kumawat, Yuta Nakashima, Shanmuganathan Raman
To handle more variety in human poses, we propose the concept of fine-grained hierarchical pose classification, in which we formulate pose estimation as a classification task, and we propose Yoga-82, a dataset for large-scale yoga pose recognition with 82 classes.
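A minimal sketch of treating pose estimation as hierarchical classification, assuming a coarse-to-fine hierarchy (e.g. 6 / 20 / 82 classes) on top of pre-extracted image features; the head wiring and dimensions are our assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class HierarchicalPoseClassifier(nn.Module):
    """Three-level classification head: coarse -> intermediate -> fine-grained pose."""
    def __init__(self, feat_dim=2048, n_coarse=6, n_mid=20, n_fine=82):
        super().__init__()
        self.coarse = nn.Linear(feat_dim, n_coarse)
        self.mid = nn.Linear(feat_dim + n_coarse, n_mid)
        self.fine = nn.Linear(feat_dim + n_mid, n_fine)

    def forward(self, feats):                     # feats: (B, feat_dim) backbone features
        c = self.coarse(feats)                    # coarse pose category logits
        m = self.mid(torch.cat([feats, c], -1))   # conditioned on coarse prediction
        f = self.fine(torch.cat([feats, m], -1))  # conditioned on intermediate prediction
        return c, m, f                            # one cross-entropy term per level
```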
no code implementations • 17 Apr 2020 • Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima
We propose a novel video understanding task by fusing knowledge-based and video question answering.
2 code implementations • 12 Dec 2019 • Liangzhi Li, Manisha Verma, Yuta Nakashima, Hajime Nagahara, Ryo Kawasaki
Retinal vessel segmentation is of great interest for diagnosis of retinal vascular diseases.
Ranked #6 on Retinal Vessel Segmentation on CHASE_DB1.
no code implementations • 23 Oct 2019 • Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima
We propose a novel video understanding task by fusing knowledge-based and video question answering.
no code implementations • 17 Sep 2019 • Benjamin Renoust, Matheus Oliveira Franca, Jacob Chan, Noa Garcia, Van Le, Ayaka Uesaka, Yuta Nakashima, Hajime Nagahara, Jueren Wang, Yutaka Fujioka
As Buddhism spread along the Silk Roads, many pieces of art were displaced.
no code implementations • 17 Sep 2019 • Benjamin Renoust, Matheus Oliveira Franca, Jacob Chan, Van Le, Ayaka Uesaka, Yuta Nakashima, Hajime Nagahara, Jueren Wang, Yutaka Fujioka
We introduce BUDA.ART, a system designed to assist researchers in Art History to explore and analyze an archive of pictures of Buddha statues.
no code implementations • 24 Apr 2019 • Noa Garcia, Benjamin Renoust, Yuta Nakashima
In computer vision, visual arts are often studied from a purely aesthetics perspective, mostly by analysing the visual appearance of an artistic reproduction to infer its style, its author, or its representative features.
1 code implementation • 10 Apr 2019 • Noa Garcia, Benjamin Renoust, Yuta Nakashima
Whereas visual representations are able to capture information about the content and the style of an artwork, our proposed context-aware embeddings additionally encode relationships between different artistic attributes, such as author, school, or historical period.
2 code implementations • CVPR 2019 • Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä
Video summarization is a technique to create a short skim of the original video while preserving the main stories/content.
no code implementations • 7 Jul 2018 • Ryosuke Kimura, Akihiko Sayo, Fabian Lorenzo Dayrit, Yuta Nakashima, Hiroshi Kawasaki, Ambrosio Blanco, Katsushi Ikeuchi
For full-body reconstruction with loose clothes, we propose to use lower-dimensional embeddings of texture and deformation, referred to as eigen-texturing and eigen-deformation, to reproduce views of even unobserved surfaces.
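A minimal sketch of the general "eigen" idea, assuming textures are stored as equal-sized RGB maps and using plain PCA via SVD; the component count and data layout are assumptions, not the paper's exact eigen-texturing/eigen-deformation construction:

```python
import numpy as np

def eigen_texture_basis(texture_maps, n_components=16):
    """PCA over flattened texture maps: returns the mean texture, a small set of
    principal 'eigen-texture' directions, and low-dimensional coefficients per map.
    texture_maps: array of shape (N, H, W, 3)."""
    X = texture_maps.reshape(len(texture_maps), -1).astype(np.float64)
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:n_components]                 # principal texture directions
    coeffs = (X - mean) @ basis.T             # per-texture low-dimensional embedding
    return mean, basis, coeffs

def reconstruct(mean, basis, coeffs):
    """Approximate a texture from its low-dimensional coefficients."""
    return coeffs @ basis + mean
```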
1 code implementation • COLING 2018 • Chenhui Chu, Mayu Otani, Yuta Nakashima
These extracted VGPs have the potential to improve language and image multimodal tasks such as visual question answering and image captioning.
no code implementations • 25 Sep 2017 • Antonio Tejero-de-Pablos, Yuta Nakashima, Tomokazu Sato, Naokazu Yokoya, Marko Linna, Esa Rahtu
The labels are provided by annotators with different levels of experience in Kendo, to demonstrate how the proposed method adapts to different needs.
2 code implementations • 28 Sep 2016 • Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya
For this, we design a deep neural network that maps both videos and descriptions to a common semantic space and train it jointly on associated pairs of videos and descriptions.
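A minimal sketch of such a joint embedding, assuming pre-extracted video and description features and a standard pairwise ranking loss; dimensions, projection depths, and the margin are illustrative, not the paper's exact network or objective:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedding(nn.Module):
    """Two branches projecting video and description features into a shared space."""
    def __init__(self, video_dim=2048, text_dim=300, embed_dim=256):
        super().__init__()
        self.video_proj = nn.Sequential(nn.Linear(video_dim, embed_dim), nn.ReLU(),
                                        nn.Linear(embed_dim, embed_dim))
        self.text_proj = nn.Sequential(nn.Linear(text_dim, embed_dim), nn.ReLU(),
                                       nn.Linear(embed_dim, embed_dim))

    def forward(self, video_feats, text_feats):
        v = F.normalize(self.video_proj(video_feats), dim=-1)
        t = F.normalize(self.text_proj(text_feats), dim=-1)
        return v, t

def pairwise_ranking_loss(v, t, margin=0.2):
    """Hinge loss pushing matched video/description pairs closer than mismatched ones."""
    sim = v @ t.t()                              # cosine similarities, (B, B)
    pos = sim.diag().view(-1, 1)                 # similarity of each true pair
    mask = 1.0 - torch.eye(sim.size(0), device=sim.device)
    loss_v = (F.relu(margin + sim - pos) * mask).mean()      # video -> description
    loss_t = (F.relu(margin + sim.t() - pos) * mask).mean()  # description -> video
    return loss_v + loss_t
```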
no code implementations • 8 Aug 2016 • Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya
In description generation, the performance level is comparable to the current state-of-the-art, although our embeddings were trained for the retrieval tasks.