Search Results for author: Martin Klein

Found 8 papers, 1 paper with code

Combining keyphrase extraction and lexical diversity to characterize ideas in publication titles

no code implementations · 30 Aug 2022 · James Powell, Martin Klein, Lyudmila Balakireva

We compare the performance of several phrase detection models, analyze the keyphrase sets output of each, and calculate lexical diversity of corpora variants incorporating keyphrases from each model, using four common lexical diversity metrics.

Keyphrase Extraction

Human-in-the-Loop Refinement of Word Embeddings

no code implementations · 6 Oct 2021 · James Powell, Kari Sentz, Martin Klein

Word embeddings are a fixed, distributional representation of the context of words in a corpus learned from word co-occurrences.

Word Embeddings

Automatically Selecting Striking Images for Social Cards

no code implementations · 8 Mar 2021 · Shawn M. Jones, Michele C. Weigle, Martin Klein, Michael L. Nelson

With these observations, we are motivated to quantify the levels of inclusion of required metadata in web resources, its evolution over time for archived resources, and create and evaluate an algorithm to automatically select a striking image for social cards.

Digital Libraries, Human-Computer Interaction

MementoEmbed and Raintale for Web Archive Storytelling

no code implementations · 1 Aug 2020 · Shawn M. Jones, Martin Klein, Michele C. Weigle, Michael L. Nelson

Search engines and social media platforms often represent web pages as cards consisting of text snippets, titles, and images.

Evaluating Memento Service Optimizations

no code implementations · 31 May 2019 · Martin Klein, Lyudmila Balakireva, Harihar Shankar

Services and applications based on the Memento Aggregator can suffer from slow response times due to the federated search across web archives performed by the Memento infrastructure.

BIG-bench Machine Learning

Collecting 16K archived web pages from 17 public web archives

1 code implementation · 9 May 2019 · Mohamed Aturban, Michael L. Nelson, Michele C. Weigle, Martin Klein, Herbert Van de Sompel

First, we used the Los Alamos National Laboratory (LANL) Memento Aggregator to collect mementos of an initial set of URIs obtained from four sources: (a) the Moz Top 500, (b) the dataset used in our previous study, (c) the HTTP Archive, and (d) the Web Archives for Historical Research group.

Digital Libraries

Approximating Document Frequency with Term Count Values

no code implementations · 23 Jul 2008 · Martin Klein, Michael L. Nelson

Intuitively this value is different from document frequency (DF), the number of documents (e.g., web pages) a certain term occurs in.

Information Retrieval, Digital Libraries, H.3.0
