Search Results for author: Ran Zmigrod

Found 17 papers, 9 papers with code

Efficient Sampling of Dependency Structure

1 code implementation • EMNLP 2021 • Ran Zmigrod, Tim Vieira, Ryan Cotterell

In this paper, we adapt two spanning tree sampling algorithms to faithfully sample dependency trees from a graph subject to the root constraint.

BuDDIE: A Business Document Dataset for Multi-task Information Extraction

no code implementations • 5 Apr 2024 • Ran Zmigrod, Dongsheng Wang, Mathieu Sibue, Yulong Pei, Petr Babkin, Ivan Brugere, Xiaomo Liu, Nacho Navarro, Antony Papadimitriou, William Watson, Zhiqiang Ma, Armineh Nourbakhsh, Sameena Shah

Several datasets exist for research on specific tasks of visually rich document understanding (VRDU), including document classification (DC), key entity extraction (KEE), entity linking, and visual question answering (VQA).

Document Classification, document understanding

Translating between SQL Dialects for Cloud Migration

no code implementations • 13 Mar 2024 • Ran Zmigrod, Salwa Alamir, Xiaomo Liu

In this work, we consider the difficulties of cloud migration for SQL databases.

Log Summarisation for Defect Evolution Analysis

no code implementations • 13 Mar 2024 • Rares Dolga, Ran Zmigrod, Rui Silva, Salwa Alamir, Sameena Shah

Log analysis and monitoring are essential aspects of software maintenance and defect identification.

TreeForm: End-to-end Annotation and Evaluation for Form Document Parsing

no code implementations • 7 Feb 2024 • Ran Zmigrod, Zhiqiang Ma, Armineh Nourbakhsh, Sameena Shah

Visually Rich Form Understanding (VRFU) poses a complex research problem due to the documents' highly structured nature and yet highly variable style and content.

Efficient Semiring-Weighted Earley Parsing

1 code implementation • 6 Jul 2023 • Andreas Opedal, Ran Zmigrod, Tim Vieira, Ryan Cotterell, Jason Eisner

This paper provides a reference description, in the form of a deduction system, of Earley's (1970) context-free parsing algorithm with various speed-ups.

Sentence

UniMorph 4.0: Universal Morphology

no code implementations • LREC 2022 • Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Benoît Sagot, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, Juan López Bautista, Gema Celeste Silva Villegas, Lucas Torroba Hennigen, Adam Ek, David Guriel, Peter Dirix, Jean-Philippe Bernardy, Andrey Scherbakov, Aziyana Bayyr-ool, Antonios Anastasopoulos, Roberto Zariquiey, Karina Sheifer, Sofya Ganieva, Hilaria Cruz, Ritván Karahóǧa, Stella Markantonatou, George Pavlidis, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Candy Angulo, Jatayu Baxi, Andrew Krizhanovsky, Natalia Krizhanovskaya, Elizabeth Salesky, Clara Vania, Sardana Ivanova, Jennifer White, Rowan Hall Maudslay, Josef Valvoda, Ran Zmigrod, Paula Czarnowska, Irene Nikkarinen, Aelita Salchak, Brijesh Bhatt, Christopher Straughn, Zoey Liu, Jonathan North Washington, Yuval Pinter, Duygu Ataman, Marcin Wolinski, Totok Suhardijanto, Anna Yablonskaya, Niklas Stoehr, Hossep Dolatian, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Aryaman Arora, Richard J. Hatcher, Ritesh Kumar, Jeremiah Young, Daria Rodionova, Anastasia Yemelina, Taras Andrushko, Igor Marchenko, Polina Mashkovtseva, Alexandra Serova, Emily Prud'hommeaux, Maria Nepomniashchaya, Fausto Giunchiglia, Eleanor Chodroff, Mans Hulden, Miikka Silfverberg, Arya D. McCarthy, David Yarowsky, Ryan Cotterell, Reut Tsarfaty, Ekaterina Vylomova

The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema.

Morphological Inflection

Exact Paired-Permutation Testing for Structured Test Statistics

1 code implementation • NAACL 2022 • Ran Zmigrod, Tim Vieira, Ryan Cotterell

However, practitioners rely on a Monte Carlo approximation to perform the paired-permutation test because a suitable exact algorithm is lacking.

Efficient Sampling of Dependency Structures

no code implementations • 14 Sep 2021 • Ran Zmigrod, Tim Vieira, Ryan Cotterell

Colbourn (1996)'s sampling algorithm has a running time of $\mathcal{O}(N^3)$, which is often greater than the mean hitting time of a directed graph.

On Finding the K-best Non-projective Dependency Trees

1 code implementation • ACL 2021 • Ran Zmigrod, Tim Vieira, Ryan Cotterell

Furthermore, we present a novel extension of the algorithm for decoding the K-best dependency trees of a graph which are subject to a root constraint.

Dependency Parsing, Sentence

Higher-order Derivatives of Weighted Finite-state Machines

1 code implementation • ACL 2021 • Ran Zmigrod, Tim Vieira, Ryan Cotterell

In the case of second-order derivatives, our scheme runs in the optimal $\mathcal{O}(A^2 N^4)$ time where $A$ is the alphabet size and $N$ is the number of states.

On Finding the $K$-best Non-projective Dependency Trees

1 code implementation • 1 Jun 2021 • Ran Zmigrod, Tim Vieira, Ryan Cotterell

Furthermore, we present a novel extension of the algorithm for decoding the $K$-best dependency trees of a graph which are subject to a root constraint.

Dependency Parsing, Sentence

Please Mind the Root: Decoding Arborescences for Dependency Parsing

1 code implementation • EMNLP 2020 • Ran Zmigrod, Tim Vieira, Ryan Cotterell

The connection between dependency trees and spanning trees is exploited by the NLP community to train and to decode graph-based dependency parsers.

Dependency Parsing

Efficient Computation of Expectations under Spanning Tree Distributions

no code implementations • 29 Aug 2020 • Ran Zmigrod, Tim Vieira, Ryan Cotterell

We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models.

Sentence

SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection

1 code implementation • WS 2020 • Ekaterina Vylomova, Jennifer White, Elizabeth Salesky, Sabrina J. Mielke, Shijie Wu, Edoardo Ponti, Rowan Hall Maudslay, Ran Zmigrod, Josef Valvoda, Svetlana Toldova, Francis Tyers, Elena Klyachko, Ilya Yegorov, Natalia Krizhanovsky, Paula Czarnowska, Irene Nikkarinen, Andrew Krizhanovsky, Tiago Pimentel, Lucas Torroba Hennigen, Christo Kirov, Garrett Nicolai, Adina Williams, Antonios Anastasopoulos, Hilaria Cruz, Eleanor Chodroff, Ryan Cotterell, Miikka Silfverberg, Mans Hulden

Systems were developed using data from 45 languages and just 5 language families, fine-tuned with data from an additional 45 languages and 10 language families (13 in total), and evaluated on all 90 languages.

Hallucination, Morphological Inflection

Information-Theoretic Probing for Linguistic Structure

1 code implementation • ACL 2020 • Tiago Pimentel, Josef Valvoda, Rowan Hall Maudslay, Ran Zmigrod, Adina Williams, Ryan Cotterell

The success of neural networks on a diverse set of NLP tasks has led researchers to question how much these networks actually "know" about natural language.

Word Embeddings

Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology

no code implementations • ACL 2019 • Ran Zmigrod, Sabrina J. Mielke, Hanna Wallach, Ryan Cotterell

Gender stereotypes are manifest in most of the world's languages and are consequently propagated or amplified by NLP systems.

counterfactual, Data Augmentation
