no code implementations • LREC 2022 • Christopher Cieri, Mark Liberman, Sunghye Cho, Stephanie Strassel, James Fiumara, Jonathan Wright
The Linguistic Data Consortium was founded in 1992 to solve the problem that limitations in access to shareable data were impeding progress in Human Language Technology research and development.
no code implementations • NIDCP (LREC) 2022 • James Fiumara, Christopher Cieri, Mark Liberman, Chris Callison-Burch, Jonathan Wright, Robert Parker
NIEUW leverages the power of novel incentives to elicit linguistic data and annotations from a wide variety of contributors including citizen scientists, game players, and language students and professionals.
no code implementations • NIDCP (LREC) 2022 • Juhong Zhan, Yue Jiang, Christopher Cieri, Mark Liberman, Jiahong Yuan, Yiya Chen, Odette Scharenborg
This paper describes our use of mixed incentives and the citizen science portal LanguageARC to prepare, collect and quality control a large corpus of object namings for the purpose of providing speech data to document the under-represented Guanzhong dialect of Chinese spoken in the Shaanxi province in the environs of Xi’an.
no code implementations • NAACL (CLPsych) 2022 • Sunghye Cho, Riccardo Fusaroli, Maggie Rose Pelella, Kimberly Tena, Azia Knox, Aili Hauptmann, Maxine Covello, Alison Russell, Judith Miller, Alison Hulink, Jennifer Uzokwe, Kevin Walker, James Fiumara, Juhi Pandey, Christopher Chatham, Christopher Cieri, Robert Schultz, Mark Liberman, Julia Parish-Morris
This study examined differences in linguistic features produced by autistic and neurotypical (NT) children during brief picture descriptions, and assessed feature stability over time.
3 code implementations • 2 Dec 2020 • Neville Ryant, Prachi Singh, Venkat Krishnamohan, Rajat Varma, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Liberman
DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain.
no code implementations • LREC 2020 • Christopher Cieri, James Fiumara
LanguageARC is a portal that offers citizen linguists opportunities to contribute to language related research.
no code implementations • LREC 2020 • Christopher Cieri, James Fiumara, Stephanie Strassel, Jonathan Wright, Denise DiPersio, Mark Liberman
This latest in a series of Linguistic Data Consortium (LDC) progress reports to the LREC community does not describe any single language resource, evaluation campaign or technology but sketches the activities, since the last report, of a data center devoted to supporting the work of LREC attendees among other research communities.
no code implementations • LREC 2020 • Daniel Jaquette, Christopher Cieri, Denise DiPersio
The authors go step-by-step through the development of the Related Works schema, implementation of the software and database changes, and data entry of the relations.
no code implementations • LREC 2020 • James Fiumara, Christopher Cieri, Jonathan Wright, Mark Liberman
Like other Citizen Science platforms and projects, LanguageARC harnesses the power and efforts of volunteers who are motivated by the incentives of contributing to science, learning and discovery, and belonging to a community dedicated to social improvement.
no code implementations • LREC 2020 • Christopher Cieri
Given the persistent gap between demand and supply, the impetus to reuse language resources is great.
1 code implementation • 18 Jun 2019 • Neville Ryant, Kenneth Church, Christopher Cieri, Alejandrina Cristia, Jun Du, Sriram Ganapathy, Mark Liberman
This paper introduces the second DIHARD challenge, the second in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain.
no code implementations • LREC 2018 • Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis, Takenobu Tokunaga, Sara Goggi, Hélène Mazo
no code implementations • LREC 2016 • Denise DiPersio, Christopher Cieri, Daniel Jaquette
Data management plans, data sharing plans and the like are now required by funders worldwide as part of research proposals.
no code implementations • LREC 2016 • Christopher Cieri, Mike Maxwell, Stephanie Strassel, Jennifer Tracey
This paper documents and describes the criteria used to select languages for study within programs that include low resource languages whether given that label or another similar one.
no code implementations • LREC 2016 • Nancy Ide, Keith Suderman, James Pustejovsky, Marc Verhagen, Christopher Cieri
The NSF-SI2-funded LAPPS Grid project is a collaborative effort among Brandeis University, Vassar College, Carnegie-Mellon University (CMU), and the Linguistic Data Consortium (LDC), which has developed an open, web-based infrastructure through which resources can be easily accessed and within which tailored language services can be efficiently composed, evaluated, disseminated and consumed by researchers, developers, and students across a wide variety of disciplines.
no code implementations • LREC 2016 • Denise DiPersio, Christopher Cieri
Since its inception in 2010, the Linguistic Data Consortium's data scholarship program has awarded no cost grants in data to 64 recipients from 26 countries.
no code implementations • LREC 2016 • Julia Parish-Morris, Christopher Cieri, Mark Liberman, Leila Bateman, Emily Ferguson, Robert T. Schultz
Autism spectrum disorder (ASD) is a complex neurodevelopmental condition that would benefit from low-cost and reliable improvements to screening and diagnosis.
no code implementations • LREC 2014 • Joseph Mariani, Christopher Cieri, Gil Francopoulo, Patrick Paroubek, Marine Delaborde
This paper describes the problems that must be addressed when studying large amounts of data over time which require entity normalization applied not to the usual genres of news or political speech, but to the genre of academic discourse about language resources, technologies and sciences.
no code implementations • LREC 2014 • Penny Labropoulou, Christopher Cieri, Maria Gavrilidou
However, the scope of the paper is limited to relations holding for datasets and tools.
no code implementations • LREC 2014 • Nancy Ide, James Pustejovsky, Christopher Cieri, Eric Nyberg, Di Wang, Keith Suderman, Marc Verhagen, Jonathan Wright
The Language Application (LAPPS) Grid project is establishing a framework that enables language service discovery, composition, and reuse and promotes sustainability, manageability, usability, and interoperability of Natural Language Processing (NLP) components.
no code implementations • LREC 2014 • Christopher Cieri, Denise DiPersio, Mark Liberman, Andrea Mazzucchi, Stephanie Strassel, Jonathan Wright
Despite the growth in the number of linguistic data centers around the world, their accomplishments and expansions and the advances they have helped enable, the language resources that exist are a small fraction of those required to meet the goals of Human Language Technologies (HLT) for the world's languages and the promises they offer: broad access to knowledge, direct communication across language boundaries and engagement in a global community.
no code implementations • LREC 2012 • Eleftheria Ahtaridis, Christopher Cieri, Denise DiPersio
The Linguistic Data Consortium (LDC) creates and provides language resources (LRs) including data, tools and specifications.
no code implementations • LREC 2012 • Christopher Cieri, Marian Reed, Denise DiPersio, Mark Liberman
On the Linguistic Data Consortium's (LDC) 20th anniversary, this paper describes the changes to the language resource landscape over the past two decades, how LDC has adjusted its practice to adapt to them and how the business model continues to grow.