Search Results for author: Yvette Graham

Found 51 papers, 8 papers with code

Findings of the 2021 Conference on Machine Translation (WMT21)

no code implementations WMT (EMNLP) 2021 Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri

This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.

Machine Translation, Translation

The Third Multilingual Surface Realisation Shared Task (SR’20): Overview and Evaluation Results

1 code implementation MSR (COLING) 2020 Simon Mille, Anya Belz, Bernd Bohnet, Thiago Castro Ferreira, Yvette Graham, Leo Wanner

As in SR’18 and SR’19, the shared task comprised two tracks: (1) a Shallow Track where the inputs were full UD structures with word order information removed and tokens lemmatised; and (2) a Deep Track where additionally, functional words and morphological information were removed.

An overview on the evaluated video retrieval tasks at TRECVID 2022

no code implementations 22 Jun 2023 George Awad, Keith Curtis, Asad Butt, Jonathan Fiscus, Afzal Godil, Yooyoung Lee, Andrew Delgado, Eliot Godard, Lukas Diduch, Jeffrey Liu, Yvette Graham, Georges Quenot

The TREC Video Retrieval Evaluation (TRECVID) is a TREC-style video analysis and retrieval evaluation with the goal of promoting progress in research and development of content-based exploitation and retrieval of information from digital video via open, tasks-based evaluation supported by metrology.

Ad-hoc video search, Retrieval +2

Is a Video worth $n\times n$ Images? A Highly Efficient Approach to Transformer-based Video Question Answering

no code implementations 16 May 2023 Chenyang Lyu, Tianbo Ji, Yvette Graham, Jennifer Foster

We show that by integrating our approach into VideoQA systems we can achieve comparable, even superior, performance with a significant speed up for training and inference.

Question Answering, Video Question Answering

Semantic-aware Dynamic Retrospective-Prospective Reasoning for Event-level Video Question Answering

no code implementations 14 May 2023 Chenyang Lyu, Tianbo Ji, Yvette Graham, Jennifer Foster

Specifically, we explicitly use the Semantic Role Labeling (SRL) structure of the question in the dynamic reasoning process, where we decide to move to the next frame based on which part of the SRL structure (agent, verb, patient, etc.) is currently in focus.

Question Answering, Semantic Role Labeling +1

QAScore -- An Unsupervised Unreferenced Metric for the Question Generation Evaluation

no code implementations 9 Oct 2022 Tianbo Ji, Chenyang Lyu, Gareth Jones, Liting Zhou, Yvette Graham

Question Generation (QG) aims to automate the task of composing questions for a passage with a set of chosen answers found within the passage.

Language Modelling, Question Generation +1

Extending the Scope of Out-of-Domain: Examining QA models in multiple subdomains

1 code implementation insights (ACL) 2022 Chenyang Lyu, Jennifer Foster, Yvette Graham

Past works that investigate out-of-domain performance of QA systems have mainly focused on general domains (e.g. the news domain, the Wikipedia domain), underestimating the importance of subdomains defined by the internal characteristics of QA datasets.

Position

Achieving Reliable Human Assessment of Open-Domain Dialogue Systems

1 code implementation ACL 2022 Tianbo Ji, Yvette Graham, Gareth J. F. Jones, Chenyang Lyu, Qun Liu

Answering the distress call of competitions that have emphasized the urgent need for better evaluation techniques in dialogue, we present the successful development of human evaluation that is highly reliable while still remaining feasible and low cost.

Dialogue Evaluation

Improving Unsupervised Question Answering via Summarization-Informed Question Generation

no code implementations EMNLP 2021 Chenyang Lyu, Lifeng Shang, Yvette Graham, Jennifer Foster, Xin Jiang, Qun Liu

Template-based QG uses linguistically-informed heuristics to transform declarative sentences into interrogatives, whereas supervised QG uses existing Question Answering (QA) datasets to train a system to generate a question given a passage and an answer.
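The template-based approach described above can be illustrated with a toy rule (a hypothetical example for this listing, not one of the paper's actual heuristics): a single copula pattern that turns an "X is Y." declarative into a question whose answer is Y.

```python
import re

def template_qg(sentence: str):
    """Toy template-based question generation: turn a declarative
    'X is Y.' sentence into an interrogative, returning (question, answer).
    One hypothetical rule for illustration -- real systems apply many
    linguistically-informed heuristics."""
    match = re.match(r"^(?P<subj>.+?) is (?P<comp>.+?)\.$", sentence)
    if match is None:
        return None  # this rule does not apply to the sentence
    subject, complement = match.group("subj"), match.group("comp")
    return f"What is {subject}?", complement

print(template_qg("Dublin is the capital of Ireland."))
# -> ('What is Dublin?', 'the capital of Ireland')
```

Supervised QG, by contrast, would learn such transformations from QA datasets rather than encoding them by hand.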

Dependency Parsing, named-entity-recognition +8

Improving Document-Level Sentiment Analysis with User and Product Context

1 code implementation COLING 2020 Chenyang Lyu, Jennifer Foster, Yvette Graham

We achieve this by explicitly storing representations of reviews written by the same user and about the same product, forcing the model to memorize all reviews for one particular user and product.
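The storing-and-retrieving idea in the snippet above can be sketched as a simple memory keyed by user and by product (a minimal illustrative structure, not the paper's actual neural architecture): review vectors accumulate per key, and their mean serves as extra context at prediction time.

```python
from collections import defaultdict

class ReviewMemory:
    """Minimal sketch of storing representations of reviews by the same
    user / about the same product. Hypothetical structure for illustration:
    each review vector is appended to per-user and per-product stores, and
    the mean of the stored vectors is returned as context."""
    def __init__(self):
        self.by_user = defaultdict(list)
        self.by_product = defaultdict(list)

    def add(self, user, product, vec):
        # remember this review under both its author and its product
        self.by_user[user].append(vec)
        self.by_product[product].append(vec)

    @staticmethod
    def _mean(vecs):
        if not vecs:
            return None
        dim = len(vecs[0])
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

    def context(self, user, product):
        # (user context vector, product context vector)
        return self._mean(self.by_user[user]), self._mean(self.by_product[product])

mem = ReviewMemory()
mem.add("u1", "p1", [1.0, 0.0])
mem.add("u1", "p2", [0.0, 1.0])
print(mem.context("u1", "p1"))  # -> ([0.5, 0.5], [1.0, 0.0])
```

In the actual model these representations would be learned jointly with the sentiment classifier rather than averaged.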

Sentiment Analysis

TRECVID 2019: An Evaluation Campaign to Benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & Retrieval

no code implementations 21 Sep 2020 George Awad, Asad A. Butt, Keith Curtis, Yooyoung Lee, Jonathan Fiscus, Afzal Godil, Andrew Delgado, Jesse Zhang, Eliot Godard, Lukas Diduch, Alan F. Smeaton, Yvette Graham, Wessel Kraaij, Georges Quenot

The TREC Video Retrieval Evaluation (TRECVID) 2019 was a TREC-style video analysis and retrieval evaluation, the goal of which remains to promote progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation.

Action Detection, Activity Detection +5

The Second Multilingual Surface Realisation Shared Task (SR'19): Overview and Evaluation Results

no code implementations WS 2019 Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Leo Wanner

We report results from the SR'19 Shared Task, the second edition of a multilingual surface realisation task organised as part of the EMNLP'19 Workshop on Multilingual Surface Realisation.

Translationese in Machine Translation Evaluation

no code implementations 24 Jun 2019 Yvette Graham, Barry Haddow, Philipp Koehn

Finally, we provide a comprehensive check-list for future machine translation evaluation.

Machine Translation, Translation

Results of the WMT18 Metrics Shared Task: Both characters and embeddings achieve good performance

no code implementations WS 2018 Qingsong Ma, Ondřej Bojar, Yvette Graham

We asked participants of this task to score the outputs of the MT systems involved in the WMT18 News Translation Task with automatic metrics.

Machine Translation, Sentence +1

The First Multilingual Surface Realisation Shared Task (SR'18): Overview and Evaluation Results

no code implementations WS 2018 Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Emily Pitler, Leo Wanner

We report results from the SR'18 Shared Task, a new multilingual surface realisation task organised as part of the ACL'18 Workshop on Multilingual Surface Realisation.

Translating Pro-Drop Languages with Reconstruction Models

1 code implementation 10 Jan 2018 Long-Yue Wang, Zhaopeng Tu, Shuming Shi, Tong Zhang, Yvette Graham, Qun Liu

Next, the annotated source sentence is reconstructed from hidden representations in the NMT model.

Machine Translation, NMT +2

Evaluation of Automatic Video Captioning Using Direct Assessment

no code implementations 29 Oct 2017 Yvette Graham, George Awad, Alan Smeaton

We present Direct Assessment, a method for manually assessing the quality of automatically-generated captions for video.

Machine Translation, Translation +1

Further Investigation into Reference Bias in Monolingual Evaluation of Machine Translation

1 code implementation EMNLP 2017 Qingsong Ma, Yvette Graham, Timothy Baldwin, Qun Liu

Monolingual evaluation of Machine Translation (MT) aims to simplify human assessment by requiring assessors to compare the meaning of the MT output with a reference translation, opening up the task to a much larger pool of genuinely qualified evaluators.

Machine Translation, Translation

Improving Evaluation of Document-level Machine Translation Quality Estimation

no code implementations EACL 2017 Yvette Graham, Qingsong Ma, Timothy Baldwin, Qun Liu, Carla Parra, Carolina Scarton

Meaningful conclusions about the relative performance of NLP systems are only possible if the gold standard employed in a given evaluation is both valid and reliable.

Document Level Machine Translation, Machine Translation +2

Is all that Glitters in Machine Translation Quality Estimation really Gold?

no code implementations COLING 2016 Yvette Graham, Timothy Baldwin, Meghan Dowling, Maria Eskevich, Teresa Lynn, Lamia Tounsi

Human-targeted metrics provide a compromise between human evaluation of machine translation, where high inter-annotator agreement is difficult to achieve, and fully automatic metrics, such as BLEU or TER, that lack the validity of human assessment.

Machine Translation, Translation
