Search Results for author: Haiyue Song

Found 20 papers, 7 papers with code

IteRABRe: Iterative Recovery-Aided Block Reduction

no code implementations • 8 Mar 2025 • Haryo Akbarianto Wibowo, Haiyue Song, Hideki Tanaka, Masao Utiyama, Alham Fikri Aji, Raj Dabre

Large Language Models (LLMs) have grown increasingly expensive to deploy, driving the need for effective model compression techniques.

Model Compression

Pralekha: An Indic Document Alignment Evaluation Benchmark

1 code implementation • 28 Nov 2024 • Sanjay Suryanarayanan, Haiyue Song, Mohammed Safi Ur Rahman Khan, Anoop Kunchukuttan, Mitesh M. Khapra, Raj Dabre

To address the challenge of aligning documents using sentence and chunk-level alignments, we propose a novel scoring method, Document Alignment Coefficient (DAC).

Sentence, Sentence Embedding, +1

Connecting Ideas in 'Lower-Resource' Scenarios: NLP for National Varieties, Creoles and Other Low-resource Scenarios

no code implementations • 19 Sep 2024 • Aditya Joshi, Diptesh Kanojia, Heather Lent, Hour Kaing, Haiyue Song

Despite excellent results on benchmarks over a small subset of languages, large language models struggle to process text from languages situated in 'lower-resource' scenarios such as dialects/sociolects (national or social varieties of a language), Creoles (languages arising from linguistic contact between multiple languages) and other low-resource languages.

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark

no code implementations • 10 Jun 2024 • David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, Aishik Mandal, Alina Dragonetti, Artem Abzaliev, Atnafu Lambebo Tonja, Bontu Fufa Balcha, Chenxi Whitehouse, Christian Salamea, Dan John Velasco, David Ifeoluwa Adelani, David Le Meur, Emilio Villa-Cueva, Fajri Koto, Fauzan Farooqui, Frederico Belcavello, Ganzorig Batnasan, Gisela Vallejo, Grainne Caulfield, Guido Ivetta, Haiyue Song, Henok Biadglign Ademtew, Hernán Maina, Holy Lovenia, Israel Abebe Azime, Jan Christian Blaise Cruz, Jay Gala, Jiahui Geng, Jesus-German Ortiz-Barajas, Jinheon Baek, Jocelyn Dunstan, Laura Alonso Alemany, Kumaranage Ravindu Yasas Nagasinghe, Luciana Benotti, Luis Fernando D'Haro, Marcelo Viridiano, Marcos Estecha-Garitagoitia, Maria Camila Buitrago Cabrera, Mario Rodríguez-Cantelar, Mélanie Jouitteau, Mihail Mihaylov, Mohamed Fazli Mohamed Imam, Muhammad Farid Adilazuarda, Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Naome Etori, Olivier Niyomugisha, Paula Mónica Silva, Pranjal Chitale, Raj Dabre, Rendi Chevi, Ruochen Zhang, Ryandito Diandaru, Samuel Cahyawijaya, Santiago Góngora, Soyeong Jeong, Sukannya Purkayastha, Tatsuki Kuribayashi, Teresa Clifford, Thanmay Jayakumar, Tiago Timponi Torrent, Toqeer Ehsan, Vladimir Araujo, Yova Kementchedjhieva, Zara Burzo, Zheng Wei Lim, Zheng Xin Yong, Oana Ignat, Joan Nwatu, Rada Mihalcea, Thamar Solorio, Alham Fikri Aji

Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used to test the ability of vision-language models to understand and reason about knowledge present in both visual and textual data.

Diversity, Question Answering, +1

Enhancing Personality Recognition in Dialogue by Data Augmentation and Heterogeneous Conversational Graph Networks

1 code implementation • 11 Jan 2024 • Yahui Fu, Haiyue Song, Tianyu Zhao, Tatsuya Kawahara

Personality recognition is useful for enhancing robots' ability to tailor user-adaptive responses, thus fostering rich human-robot interactions.

Data Augmentation

Bilingual Corpus Mining and Multistage Fine-Tuning for Improving Machine Translation of Lecture Transcripts

1 code implementation • 7 Nov 2023 • Haiyue Song, Raj Dabre, Chenhui Chu, Atsushi Fujita, Sadao Kurohashi

To create the parallel corpora, we propose a dynamic-programming-based sentence alignment algorithm that leverages the cosine similarity of machine-translated sentences.

Benchmarking, Machine Translation, +3
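The dynamic-programming alignment mentioned in the abstract above can be pictured with a minimal sketch: a monotonic 1-to-1 aligner that scores candidate pairs by the cosine similarity of precomputed sentence embeddings (e.g., embeddings of machine-translated source sentences against target sentences). The skip penalty and strict 1-to-1 matching are simplifying assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def align_sentences(src_vecs, tgt_vecs, skip_penalty=0.1):
    """Monotonic 1-to-1 sentence alignment by dynamic programming.

    src_vecs: embeddings of machine-translated source sentences.
    tgt_vecs: embeddings of target-language sentences.
    Returns a list of (src_index, tgt_index) aligned pairs.
    """
    n, m = len(src_vecs), len(tgt_vecs)
    dp = np.zeros((n + 1, m + 1))               # dp[i][j]: best score over prefixes i, j
    back = np.zeros((n + 1, m + 1), dtype=int)  # 0: skip src, 1: skip tgt, 2: align pair
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            choices = (
                dp[i - 1, j] - skip_penalty,    # leave source sentence i unaligned
                dp[i, j - 1] - skip_penalty,    # leave target sentence j unaligned
                dp[i - 1, j - 1] + cosine(src_vecs[i - 1], tgt_vecs[j - 1]),
            )
            back[i, j] = int(np.argmax(choices))
            dp[i, j] = choices[back[i, j]]
    # Trace back the best path to recover aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if back[i, j] == 2:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif back[i, j] == 0:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]
```

Practical aligners typically also handle one-to-many merges and length-based priors, which this sketch omits.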

Variable-length Neural Interlingua Representations for Zero-shot Neural Machine Translation

no code implementations • 17 May 2023 • Zhuoyuan Mao, Haiyue Song, Raj Dabre, Chenhui Chu, Sadao Kurohashi

The language independence of encoded representations within multilingual neural machine translation (MNMT) models is crucial for their generalization ability on zero-shot translation.

Machine Translation, Translation

GPT-RE: In-context Learning for Relation Extraction using Large Language Models

1 code implementation • 3 May 2023 • Zhen Wan, Fei Cheng, Zhuoyuan Mao, Qianying Liu, Haiyue Song, Jiwei Li, Sadao Kurohashi

In spite of the potential for ground-breaking achievements offered by large language models (LLMs) (e.g., GPT-3), they still lag significantly behind fully-supervised baselines (e.g., fine-tuned BERT) in relation extraction (RE).

In-Context Learning, Relation, +2

Relation Extraction with Weighted Contrastive Pre-training on Distant Supervision

no code implementations • 18 May 2022 • Zhen Wan, Fei Cheng, Qianying Liu, Zhuoyuan Mao, Haiyue Song, Sadao Kurohashi

Contrastive pre-training on distant supervision has shown remarkable effectiveness in improving supervised relation extraction tasks.

Contrastive Learning, Relation, +1

When do Contrastive Word Alignments Improve Many-to-many Neural Machine Translation?

no code implementations • Findings (NAACL) 2022 • Zhuoyuan Mao, Chenhui Chu, Raj Dabre, Haiyue Song, Zhen Wan, Sadao Kurohashi

Meanwhile, the contrastive objective can implicitly utilize automatically learned word alignment, which has not been explored in many-to-many NMT.

Machine Translation, NMT, +5
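As background for the contrastive objective mentioned above, a generic InfoNCE-style loss over aligned word-representation pairs is sketched below; this is a common formulation with in-batch negatives, not necessarily the exact objective used in the paper.

```python
import torch
import torch.nn.functional as F

def word_contrastive_loss(src_word_states, tgt_word_states, temperature=0.1):
    """InfoNCE-style contrastive loss over aligned word representations.

    src_word_states, tgt_word_states: (num_pairs, dim) tensors; row k of each
    holds the representations of an aligned source/target word pair, and all
    other rows in the batch act as in-batch negatives.
    """
    src = F.normalize(src_word_states, dim=-1)
    tgt = F.normalize(tgt_word_states, dim=-1)
    logits = src @ tgt.t() / temperature                    # pairwise similarity matrix
    labels = torch.arange(src.size(0), device=src.device)   # positives on the diagonal
    # Each source word should be most similar to its own aligned target word.
    return F.cross_entropy(logits, labels)
```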

Video-guided Machine Translation with Spatial Hierarchical Attention Network

no code implementations • ACL 2021 • Weiqi Gu, Haiyue Song, Chenhui Chu, Sadao Kurohashi

Video-guided machine translation, a type of multimodal machine translation, aims to use video content as auxiliary information to address the word-sense ambiguity problem in machine translation.

Action Detection, Machine Translation, +2

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation

1 code implementation • LREC 2020 • Zhuoyuan Mao, Fabien Cromieres, Raj Dabre, Haiyue Song, Sadao Kurohashi

Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora.

Low Resource NMT, NMT, +2
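For readers unfamiliar with MASS, the objective masks a contiguous span on the encoder side and trains the decoder to reconstruct that span. The toy snippet below illustrates only that generic pair construction; JASS's Japanese-specific masking is more involved and is not reproduced here.

```python
import random

MASK = "[MASK]"

def mass_example(tokens, mask_ratio=0.5):
    """Build a MASS-style (encoder_input, decoder_target) pair from one sentence.

    A contiguous span covering roughly mask_ratio of the tokens is replaced by
    [MASK] on the encoder side, and the decoder is trained to generate exactly
    that span.
    """
    span_len = max(1, int(len(tokens) * mask_ratio))
    start = random.randint(0, len(tokens) - span_len)
    encoder_input = tokens[:start] + [MASK] * span_len + tokens[start + span_len:]
    decoder_target = tokens[start:start + span_len]
    return encoder_input, decoder_target

# Example (hypothetical sentence):
# mass_example("monolingual pre-training boosts low-resource nmt".split())
```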

Pre-training via Leveraging Assisting Languages and Data Selection for Neural Machine Translation

no code implementations • 23 Jan 2020 • Haiyue Song, Raj Dabre, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi, Eiichiro Sumita

To this end, we propose to exploit monolingual corpora of other languages to compensate for the scarcity of monolingual corpora for the languages of interest (LOI).

Machine Translation, NMT, +1

Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation

1 code implementation • LREC 2020 • Haiyue Song, Raj Dabre, Atsushi Fujita, Sadao Kurohashi

To address this, we examine a language-independent framework for parallel corpus mining, which is a quick and effective way to mine a parallel corpus from publicly available lectures on Coursera.

Benchmarking, Domain Adaptation, +4

Invocation-driven Neural Approximate Computing with a Multiclass-Classifier and Multiple Approximators

1 code implementation • 19 Oct 2018 • Haiyue Song, Chengwen Xu, Qiang Xu, Zhuoran Song, Naifeng Jing, Xiaoyao Liang, Li Jiang

We thus propose a novel approximate computing architecture with a Multiclass-Classifier and Multiple Approximators (MCMA).
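The abstract above suggests a dispatch pattern in which a multiclass classifier routes each invocation either to one of several lightweight approximators or back to the exact computation. The following toy sketch illustrates that pattern; the class-0-means-exact convention and the placeholder models are assumptions made for illustration, not the paper's architecture.

```python
class MCMADispatcher:
    """Toy illustration of an invocation-driven dispatcher in the spirit of MCMA.

    A multiclass classifier routes each invocation either to one of several
    lightweight approximators or back to the exact function. Treating class 0
    as "compute exactly" is an assumption made for this sketch.
    """

    def __init__(self, classifier, approximators, exact_fn):
        self.classifier = classifier        # callable: input -> class index
        self.approximators = approximators  # list of cheap approximate models
        self.exact_fn = exact_fn            # original (expensive) computation

    def __call__(self, x):
        k = int(self.classifier(x))
        if k == 0:
            return self.exact_fn(x)          # input judged too hard to approximate
        return self.approximators[k - 1](x)  # dispatch to the selected approximator


# Hypothetical usage with placeholder models:
# dispatcher = MCMADispatcher(my_router, [small_net_a, small_net_b], expensive_kernel)
# y = dispatcher(x)
```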
