no code implementations • 30 Aug 2022 • James Powell, Martin Klein, Lyudmila Balakireva
We compare the performance of several phrase detection models, analyze the keyphrase set output by each, and calculate the lexical diversity of corpus variants incorporating keyphrases from each model, using four common lexical diversity metrics.
no code implementations • 6 Oct 2021 • James Powell, Kari Sentz, Martin Klein
Word embeddings are a fixed, distributional representation of the context of words in a corpus learned from word co-occurrences.
no code implementations • 8 Mar 2021 • Shawn M. Jones, Michele C. Weigle, Martin Klein, Michael L. Nelson
With these observations, we are motivated to quantify the levels of inclusion of required metadata in web resources and its evolution over time for archived resources, and to create and evaluate an algorithm that automatically selects a striking image for social cards.
Digital Libraries Human-Computer Interaction
no code implementations • 1 Aug 2020 • Shawn M. Jones, Martin Klein, Michele C. Weigle, Michael L. Nelson
Search engines and social media platforms often represent web pages as cards consisting of text snippets, titles, and images.
no code implementations • 1 Aug 2020 • Shawn M. Jones, Alexander C. Nwala, Martin Klein, Michele C. Weigle, Michael L. Nelson
StoryGraph clusters news articles together to identify a common news story.
no code implementations • 31 May 2019 • Martin Klein, Lyudmila Balakireva, Harihar Shankar
Services and applications based on the Memento Aggregator can suffer from slow response times due to the federated search across web archives performed by the Memento infrastructure.
1 code implementation • 9 May 2019 • Mohamed Aturban, Michael L. Nelson, Michele C. Weigle, Martin Klein, Herbert Van de Sompel
First, we used the Los Alamos National Laboratory (LANL) Memento Aggregator to collect mementos of an initial set of URIs obtained from four sources: (a) the Moz Top 500, (b) the dataset used in our previous study, (c) the HTTP Archive, and (d) the Web Archives for Historical Research group.
Digital Libraries
no code implementations • 23 Jul 2008 • Martin Klein, Michael L. Nelson
Intuitively, this value is different from document frequency (DF), the number of documents (e.g., web pages) a certain term occurs in.
Information Retrieval Digital Libraries H.3.0
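
The definition of document frequency (DF) quoted in the last entry can be illustrated with a minimal Python sketch. The sample documents, the whitespace tokenization, and the helper name `document_frequency` are assumptions made for illustration and are not taken from the paper. The comparison with a raw occurrence count is likewise only an illustration of why DF can differ from other term counts, not necessarily the quantity the abstract refers to.

```python
from collections import Counter


def document_frequency(documents):
    """Map each term to the number of documents it occurs in (hypothetical helper)."""
    df = Counter()
    for text in documents:
        # Using a set counts each term at most once per document.
        df.update(set(text.lower().split()))
    return df


# Toy corpus, invented for this example.
docs = [
    "web archives preserve web pages",
    "search engines index web pages",
    "archives store mementos",
]

df = document_frequency(docs)

# Raw occurrence count across the corpus, shown only for contrast with DF.
occurrences = Counter(word for text in docs for word in text.lower().split())

print(df["web"])           # 2: "web" occurs in two documents
print(occurrences["web"])  # 3: "web" occurs three times in total
```

Here "web" has a DF of 2 but occurs 3 times overall, which shows how a document-level count and an occurrence-level count can diverge for the same term.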