Previous work is mostly based on statistical methods that estimate word-level salience, which do not consider semantics or larger context when quantifying importance.
We propose a modeling framework for event data that excels in the small-data regime and can incorporate domain knowledge.
Time-fluctuating signals are ubiquitous and diverse in many physical, chemical, and biological systems, among which random telegraph signals (RTSs) refer to a series of instantaneous switching events between two discrete levels from single-particle movements.
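As a rough illustration of the two-level switching described above, the following sketch simulates a discrete-time random telegraph signal. The per-step switching probabilities `p_up` and `p_down`, the level values, and the function names are assumptions made for this example; they are not taken from the paper.

```python
import random


def simulate_rts(n_steps, p_up, p_down, low=0.0, high=1.0, seed=0):
    """Simulate a two-level signal that switches instantaneously between levels.

    p_up is the per-step probability of jumping low -> high, and p_down the
    per-step probability of jumping high -> low (both hypothetical parameters).
    """
    rng = random.Random(seed)
    level = low
    signal = []
    for _ in range(n_steps):
        if level == low and rng.random() < p_up:
            level = high
        elif level == high and rng.random() < p_down:
            level = low
        signal.append(level)
    return signal


def count_switches(signal):
    """Count the switching events, i.e. changes between consecutive samples."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a != b)
```

With `p_up` and `p_down` both set to zero the signal never leaves the low level, which is a quick sanity check on the switching logic.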
We study unsupervised multi-hop reranking for multi-hop QA (MQA) with open-domain questions.
Ideology is at the core of political science research.
NLP-powered automatic question generation (QG) techniques carry great pedagogical potential for saving educators' time and benefiting student learning.
We revisit ideas presented by Lugosch et al. using speech pre-training and three-module modeling; however, to ease construction of the end-to-end SLU model, we use as our phoneme module an open-source acoustic-phonetic model from a DNN-HMM hybrid automatic speech recognition (ASR) system instead of training one from scratch.
Combined with transfer learning, a substantial F1 score boost (5-25) can be further achieved during the early iterations of active learning across domains.
In this work, we present HIBRIDS, which injects Hierarchical Biases foR Incorporating Document Structure into the calculation of attention scores.
Cross-modal hashing (CMH) still faces challenges that need to be addressed: (1) most existing CMH methods take graphs as input to model the data distribution.
Relative radiometric normalization (RRN) of different satellite images of the same terrain is necessary for change detection, object classification/segmentation, and map-making tasks.
21 Oct 2021 • Imanol Luengo, Maria Grammatikopoulou, Rahim Mohammadi, Chris Walsh, Chinedu Innocent Nwoye, Deepak Alapatt, Nicolas Padoy, Zhen-Liang Ni, Chen-Chen Fan, Gui-Bin Bian, Zeng-Guang Hou, Heonjin Ha, Jiacheng Wang, Haojie Wang, Dong Guo, Lu Wang, Guotai Wang, Mobarakol Islam, Bharat Giddwani, Ren Hongliang, Theodoros Pissas, Claudio Ravasio, Martin Huber, Jeremy Birch, Joan M. Nunez Do Rio, Lyndon Da Cruz, Christos Bergeles, Hongyu Chen, Fucang Jia, Nikhil KumarTomar, Debesh Jha, Michael A. Riegler, Pal Halvorsen, Sophia Bano, Uddhav Vaghela, Jianyuan Hong, Haili Ye, Feihong Huang, Da-Han Wang, Danail Stoyanov
In 2020, we released pixel-wise semantic annotations for anatomy and instruments for 4670 images sampled from 25 videos of the CATARACTS training set.
1 Oct 2021 • Yan Xia, Linhui Jiang, Lu Wang, Xue Chen, Jianjie Ye, Tangyan Hou, Liqiang Wang, Yibo Zhang, Mengying Li, Zhen Li, Zhe Song, Yaping Jiang, Weiping Liu, Pengfei Li, Daniel Rosenfeld, John H. Seinfeld, Shaocai Yu
Our results show that the ORRS measurements, assisted by the machine-learning-based ensemble model developed here, can realize day-to-day supervision of on-road vehicle-specific emissions.
We propose a principled method to learn a set of human-readable logic rules to explain temporal point processes.
We study generating abstractive summaries that are faithful and factually consistent with the given articles.
We study controllable text summarization, which allows users to control a particular attribute (e.g., length limit) of the generated summaries.
With the increasing scale of search engine marketing, designing an efficient bidding system is becoming paramount for the success of e-commerce companies.
In this paper, we model the propagation of COVID-19 as spatio-temporal point processes and propose a generative and intensity-free model to track the spread of the disease.
We address the problem of unsupervised localization of key-steps and feature learning in instructional videos using both visual and language instructions.
We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.
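The idea of freezing the weights and adding a trainable low-rank update can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the class name `LoRALinear`, the helper functions, and the initialization (small Gaussian values for `A`, zeros for `B`) are choices made for this example, and the scaling factor typically applied to the update is omitted for brevity.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A."""

    def __init__(self, weight, rank, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen: not updated during fine-tuning
        # Assumed initialization: A is small and random, B is zero, so the
        # low-rank update contributes nothing before training starts.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        """Compute W x + B (A x) for a column vector x (list of one-element rows)."""
        base = matmul(self.weight, x)
        update = matmul(self.B, matmul(self.A, x))
        return add(base, update)

    def num_trainable(self):
        """Number of trainable entries: only A and B, never W."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])
```

For a `d_out x d_in` weight and rank `r`, the trainable count is `r * (d_in + d_out)` instead of `d_in * d_out`, which is where the parameter reduction comes from when `r` is small.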
To enrich the generation with diverse content, we further propose to use large pre-trained models to predict relevant concepts and to generate claims.
Secondly, on top of the proposed graph transformer, we introduce a two-stream encoder that separately extracts representations from temporal neighborhoods associated with the two interaction nodes and then utilizes a co-attentional transformer to model inter-dependencies at a semantic level.
This paper first introduces a novel method to generate anomalous data by breaking up global structures while preserving local structures of normal data at multiple levels.
Using attention head masking, we are able to reveal the relation between encoder-decoder attentions and content selection behaviors of summarization models.
The quadratic computational and memory complexities of large Transformers have limited their scalability for long document summarization.
In this paper, we developed a new ensemble machine learning Python package based on multi-task learning (MTL), referred to as the Med-Multi-Task Learning (MD-MTL) package and applied it in predicting disease scores of patients, and in carrying out risk factor analysis on multiple subgroups of patients simultaneously.
In this paper, we propose an annotation-efficient learning framework for segmentation tasks that avoids annotations of training images, where we use an improved Cycle-Consistent Generative Adversarial Network (GAN) to learn from a set of unpaired medical images and auxiliary masks obtained either from a shape model or public datasets.
Industrial Internet of Things (IoT) enables distributed intelligent services varying with the dynamic and real-time industrial devices to achieve Industry 4.0 benefits.
In this work, we present a novel content-controlled text generation framework, PAIR, with planning and iterative refinement, which is built upon a large model, BART.
Understanding discourse structures of news articles is vital to effectively contextualize the occurrence of a news event.
Trending topics in social media content evolve over time, and it is therefore crucial to understand social media users and their interpersonal communications in a dynamic manner.
Falls are one of the greatest risks to the elderly, and fall event detection in the solitary scene has been a hot research issue in recent years.
We therefore propose a novel model, XREF, that leverages attention mechanisms to (1) pinpoint relevant context within comments, and (2) detect supporting entities from the news article.
Metric learning is an important family of algorithms for classification and similarity search, but the robustness of learned metrics against small adversarial perturbations is less studied.
With the help of this technology, doctors can significantly reduce exposure frequency and intensity of the X-ray during coronary angiography.
By constraining adversarial perturbations in a low-dimensional subspace via spanning an auxiliary unlabeled dataset, the spanning attack significantly improves the query efficiency of a wide variety of existing black-box attacks.
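One way to picture the subspace constraint is sketched below: an orthonormal basis is built from a few auxiliary samples, and a perturbation is then expressed through a short coefficient vector over that basis, so a black-box attack would search over the coefficients rather than every input dimension. Gram-Schmidt orthogonalization and the function names are assumptions made for this illustration; the paper's actual construction may differ.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: orthonormal basis of the span of the auxiliary vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors already inside the current span
            basis.append([wi / norm for wi in w])
    return basis


def perturbation_from_coeffs(coeffs, basis):
    """Map low-dimensional coefficients to a perturbation in input space."""
    dim = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(dim)]
```

With `k` auxiliary samples in a `d`-dimensional input space, the search space shrinks from `d` to at most `k` dimensions, and every resulting perturbation lies in the span of the auxiliary data.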
Sequence-to-sequence models for abstractive summarization have been studied extensively, yet the generated summaries commonly suffer from fabricated content, and are often found to be near-extractive.
The reconstruction of three-dimensional models of coronary arteries is of great significance for the localization, evaluation and diagnosis of stenosis and plaque in the arteries, as well as for the assisted navigation of interventional surgery.
This article proposes a new video segmentation framework that can extract the clearest and most comprehensive coronary angiography images from a video sequence, thereby helping physicians to better observe the condition of blood vessels.
23 Mar 2020 • Tobias Ross, Annika Reinke, Peter M. Full, Martin Wagner, Hannes Kenngott, Martin Apitz, Hellena Hempe, Diana Mindroc Filimon, Patrick Scholz, Thuy Nuong Tran, Pierangela Bruno, Pablo Arbeláez, Gui-Bin Bian, Sebastian Bodenstedt, Jon Lindström Bolmgren, Laura Bravo-Sánchez, Hua-Bin Chen, Cristina González, Dong Guo, Pål Halvorsen, Pheng-Ann Heng, Enes Hosgor, Zeng-Guang Hou, Fabian Isensee, Debesh Jha, Tingting Jiang, Yueming Jin, Kadir Kirtac, Sabrina Kletz, Stefan Leger, Zhixuan Li, Klaus H. Maier-Hein, Zhen-Liang Ni, Michael A. Riegler, Klaus Schoeffmann, Ruohua Shi, Stefanie Speidel, Michael Stenzel, Isabell Twick, Gutai Wang, Jiacheng Wang, Liansheng Wang, Lu Wang, Yu-Jie Zhang, Yan-Jie Zhou, Lei Zhu, Manuel Wiesenfarth, Annette Kopp-Schneider, Beat P. Müller-Stich, Lena Maier-Hein
The validation of the competing methods for the three tasks (binary segmentation, multi-instance detection and multi-instance segmentation) was performed in three different stages with an increasing domain gap between the training and the test data.
Owing to their advantages in similarity computation and database storage for large-scale multi-modal data, cross-modal hashing methods have attracted extensive attention in similarity retrieval across heterogeneous modalities.
Large-scale cross-modal hashing similarity retrieval has attracted more and more attention in modern search applications such as search engines and autopilot, showing great superiority in computation and storage.
Graph representation learning, aiming to learn low-dimensional representations which capture the geometric dependencies between nodes in the original graph, has gained increasing popularity in a variety of graph analysis tasks, including node classification and link prediction.
The increasing prevalence of political bias in news media calls for greater public awareness of it, as well as robust methods for its detection.
Human judges further rate our system summaries as more informative and coherent than those by popular summarization models.
Building effective text generation systems requires three critical components: content selection, text planning, and surface realization, and traditionally they are tackled as separate problems.
Furthermore, we show that dual solutions for these QP problems could give us a valid lower bound of the adversarial perturbation that can be used for formal robustness verification, giving us a nice view of attack/verification for NN models.
To address this problem, we propose a reinforcement learning (RL) approach for keyphrase generation, with an adaptive reward function that encourages a model to generate both sufficient and accurate keyphrases.
We hypothesize that both the context of the ongoing conversations and the users' previous chatting history will affect their continued interests in future engagement.
Semantic parsing aims to transform natural language (NL) utterances into formal meaning representations (MRs), whereas an NL generator achieves the reverse: producing an NL description for some given MRs.
Recent research on margin theory has proved that maximizing the minimum margin, as support vector machines do, does not necessarily lead to better performance; instead, it is crucial to optimize the margin distribution.
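To make the contrast between the minimum margin and the margin distribution concrete, the sketch below computes both for a binary classifier. It assumes the common definition of the margin of an example as `y * f(x)` with labels in {-1, +1}; the function names are hypothetical, and this is not the loss function used in the paper.

```python
from statistics import mean, pvariance


def margins(scores, labels):
    """Per-example margin y * f(x) for labels in {-1, +1}."""
    return [y * s for s, y in zip(scores, labels)]


def margin_summary(scores, labels):
    """Minimum margin alongside the first two moments of the margin distribution."""
    m = margins(scores, labels)
    return {"min": min(m), "mean": mean(m), "variance": pvariance(m)}
```

A minimum-margin objective looks only at the `min` entry, whereas a margin-distribution objective would also account for statistics such as the mean and variance across all training examples.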
Driven by Convolutional Neural Networks, object detection and semantic segmentation have seen significant improvements.
We utilize a convex margin distribution loss function on the deep neural networks to validate our theoretical results by optimizing the margin ratio.
We obtain appealing results in both tasks, which shows that the independence prior is useful for instance segmentation and that instance masks can be learned without supervision from only one image.
Sequence-to-sequence (seq2seq) neural models have been actively investigated for abstractive summarization.
In this paper, we propose a Graph-Sequence-to-Sequence (GraphSeq2Seq) model to fuse the dependency graph among words into the traditional Seq2Seq framework.
Question-Answer (QA) matching is a fundamental task in the Natural Language Processing community.
Prior relevant studies recommend treatments using either supervised learning (e.g., matching the indicator signal that denotes doctor prescriptions) or reinforcement learning (e.g., maximizing the evaluation signal that indicates cumulative reward from survival rates).
We propose a statistical model that jointly captures: (1) topics for representing user interests and conversation content, and (2) discourse modes for describing user replying behavior and conversation dynamics.
Using a dataset of 118 Oxford-style debates, our model's combination of content (as latent topics) and style (as linguistic features) allows us to predict audience-adjudicated winners with 74% accuracy, significantly outperforming linguistic features alone (66%).
Our session-based models outperform the state-of-the-art method for the entity extraction task in SDS.
We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization.
We present a token-level decision summarization framework that utilizes the latent topic structures of utterances to identify "summary-worthy" words.
We present a novel unsupervised framework for focused meeting summarization that views the problem as an instance of relation extraction.
Existing timeline generation systems for complex events consider only information from traditional media, ignoring the rich social context provided by user-generated content that reveals representative public interests or insightful opinions.
We investigate the novel task of online dispute detection and propose a sentiment analysis solution to the problem: we aim to identify the sequence of sentence-level sentiments expressed during a discussion and to use them as features in a classifier that predicts the DISPUTE/NON-DISPUTE label for the discussion as a whole.
For example, on discussions from Wikipedia Talk pages, the isotonic CRF model achieves F1 scores of 0.74 and 0.67 for agreement and disagreement detection, whereas a linear-chain CRF obtains 0.58 and 0.56.