no code implementations • LREC 2022 • Jennifer Tracey, Ann Bies, Jeremy Getman, Kira Griffitt, Stephanie Strassel
This paper describes data resources created for Phase 1 of the DARPA Active Interpretation of Disparate Alternatives (AIDA) program, which aims to develop language technology that can help humans manage large volumes of sometimes conflicting information to develop a comprehensive understanding of events around the world, even when such events are described in multiple media and languages.
no code implementations • LREC 2020 • Justin Mott, Ann Bies, Stephanie Strassel, Jordan Kodner, Caitlin Richter, Hongzhi Xu, Mitchell Marcus
This paper describes a new morphology resource created by Linguistic Data Consortium and the University of Pennsylvania for the DARPA LORELEI Program.
no code implementations • LREC 2016 • Xuansong Li, Martha Palmer, Nianwen Xue, Lance Ramshaw, Mohamed Maamouri, Ann Bies, Kathryn Conger, Stephen Grimes, Stephanie Strassel
High accuracy for automated translation and information retrieval calls for linguistic annotations at various language levels.
no code implementations • LREC 2016 • Seth Kulick, Ann Bies
The Low Resource Language research conducted under DARPA's Broad Operational Language Translation (BOLT) program required the rapid creation of text corpora of typologically diverse languages (Turkish, Hausa, and Uzbek) which were annotated with morphological information, along with other types of annotation.
no code implementations • LREC 2016 • Justin Mott, Ann Bies, Zhiyi Song, Stephanie Strassel
This paper introduces the parallel Chinese-English Entities, Relations and Events (ERE) corpora developed by Linguistic Data Consortium under the DARPA Deep Exploration and Filtering of Text (DEFT) Program.
no code implementations • WS 2014 • Ann Bies, Zhiyi Song, Mohamed Maamouri, Stephen Grimes, Haejoong Lee, Jonathan Wright, Stephanie Strassel, Nizar Habash, Ramy Eskander, Owen Rambow
no code implementations • LREC 2014 • Ann Bies, Justin Mott, Seth Kulick, Jennifer Garland, Colin Warner
New annotation guidelines and new processing methods were developed to accommodate English treebank annotation of a parallel English/Chinese corpus of web data that includes alternate English translations (one fluent, one literal) of expressions that are idiomatic in the Chinese source.
no code implementations • LREC 2014 • Mohamed Maamouri, Ann Bies, Seth Kulick, Michael Ciul, Nizar Habash, Ramy Eskander
This paper describes the parallel development of an Egyptian Arabic Treebank and a morphological analyzer for Egyptian Arabic (CALIMA).
no code implementations • LREC 2012 • Mohamed Maamouri, Ann Bies, Seth Kulick
Because news broadcasts are predominantly scripted, most of the transcribed speech is in Modern Standard Arabic (MSA).
no code implementations • LREC 2012 • Seth Kulick, Ann Bies, Justin Mott
Annotations of word sequences are compared both for their internal structural consistency and for their external relation to the rest of the tree.
no code implementations • LREC 2012 • Xuansong Li, Stephanie Strassel, Stephen Grimes, Safa Ismael, Mohamed Maamouri, Ann Bies, Nianwen Xue
Parallel aligned treebanks (PAT) are linguistic corpora annotated with morphological and syntactic structures that are aligned at sentence as well as sub-sentence levels.