This paper introduces a new video-and-language dataset with human actions for multimodal logical inference, which focuses on intentional and aspectual expressions that describe dynamic human actions.
In formal semantics, there are two well-developed semantic frameworks: event semantics, which treats verbs and adverbial modifiers using the notion of event, and degree semantics, which analyzes adjectives and comparatives using the notion of degree.
Comparative constructions pose a challenge in Natural Language Inference (NLI), which is the task of determining whether a text entails a hypothesis.
This indicates that the generalization ability of neural models is limited to cases where the syntactic structures are nearly the same as those in the training set.
Monotonicity reasoning is one of the important reasoning skills for any intelligent natural language inference (NLI) model in that it requires the ability to capture the interaction between lexical and syntactic structures.
Much recent research on multimodal inference across text and vision has sought to obtain visually grounded word and sentence representations.
We propose a new domain adaptation method for Combinatory Categorial Grammar (CCG) parsing, based on the idea of automatic generation of CCG corpora exploiting cheaper resources of dependency trees.
The basic idea is to assign the same type to both declarative sentences and interrogative sentences, partly building on the recent proposal in Inquisitive Semantics.
To investigate this issue, we introduce a new dataset, called HELP, for handling entailments with lexical and logical phenomena.
In logic-based approaches to reasoning tasks such as Recognizing Textual Entailment (RTE), it is important for a system to have a large amount of knowledge data.
In this paper, we present a sequence-to-sequence model for generating sentences from logical meaning representations based on event semantics.
How to identify, extract, and use phrasal knowledge is a crucial problem for the task of Recognizing Textual Entailment (RTE).
In formal logic-based approaches to Recognizing Textual Entailment (RTE), a Combinatory Categorial Grammar (CCG) parser is used to parse input premises and hypotheses to obtain their logical formulas.
Determining semantic textual similarity is a core research subject in natural language processing.
We approach the recognition of textual entailment using logical semantic representations and a theorem prover.
This paper proposes a methodology for building a specialized Japanese dataset for recognizing temporal relations and discourse relations.