Search Results for author: Vlad Niculae

Found 51 papers, 21 papers with code

On Target Representation in Continuous-output Neural Machine Translation

no code implementations • RepL4NLP (ACL) 2022 • Evgeniia Tokarchuk, Vlad Niculae

We explore pretrained embeddings and also introduce knowledge transfer from the discrete Transformer model using embeddings in Euclidean and non-Euclidean spaces.

Audio Generation • Machine Translation • +4

Sparse and Structured Hopfield Networks

1 code implementation • 21 Feb 2024 • Saul Santos, Vlad Niculae, Daniel McNamee, André F. T. Martins

Modern Hopfield networks have enjoyed recent interest due to their connection to attention in transformers.

Multiple Instance Learning • Retrieval

On Measuring Context Utilization in Document-Level MT Systems

1 code implementation • 2 Feb 2024 • Wafaa Mohammed, Vlad Niculae

We propose to complement accuracy-based evaluation with measures of context utilization.


The Unreasonable Effectiveness of Random Target Embeddings for Continuous-Output Neural Machine Translation

no code implementations • 31 Oct 2023 • Evgeniia Tokarchuk, Vlad Niculae

Continuous-output neural machine translation (CoNMT) replaces the discrete next-word prediction problem with an embedding prediction.

Machine Translation • Translation
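The abstract above only names the embedding-prediction idea; as a minimal sketch (not the authors' implementation, with hypothetical toy embeddings), CoNMT-style decoding can map a predicted continuous vector back to a word by nearest neighbor in embedding space:

```python
import numpy as np

# Toy vocabulary embeddings (rows); the values are illustrative only.
vocab = ["the", "cat", "sat"]
emb = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [0.7, 0.7]])

def decode_nearest(pred, emb, vocab):
    """Map a predicted continuous vector to the closest vocabulary word
    by Euclidean distance, as in continuous-output decoding."""
    dists = np.linalg.norm(emb - pred, axis=1)
    return vocab[int(np.argmin(dists))]

pred = np.array([0.1, 0.9])              # the model's continuous prediction
print(decode_nearest(pred, emb, vocab))  # -> cat
```

The loss during training is then a distance between the predicted vector and the target word's embedding, rather than a softmax over the vocabulary.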

Joint Dropout: Improving Generalizability in Low-Resource Neural Machine Translation through Phrase Pair Variables

no code implementations • 24 Jul 2023 • Ali Araabi, Vlad Niculae, Christof Monz

Despite the tremendous success of Neural Machine Translation (NMT), its performance on low-resource language pairs remains subpar, partly due to its limited ability to handle previously unseen inputs, i.e., to generalize.

Low-Resource Neural Machine Translation • NMT • +1

Two derivations of Principal Component Analysis on datasets of distributions

no code implementations • 23 Jun 2023 • Vlad Niculae

In this brief note, we formulate Principal Component Analysis (PCA) over datasets consisting not of points but of distributions, characterized by their location and covariance.
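The snippet above does not spell out the construction. One natural route (an assumption for illustration, not necessarily either of the note's two derivations) is the law of total covariance: treat the dataset as a mixture, so the total covariance is the scatter of the locations plus the average within-distribution covariance, and take principal components of that:

```python
import numpy as np

# Hypothetical dataset of Gaussians, each given by a mean and a covariance.
mus = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
sigmas = np.array([np.eye(2) * 0.1] * 3)

# Law of total covariance: mixture covariance = scatter of the means
# plus the average within-distribution covariance.
mu_bar = mus.mean(axis=0)
between = (mus - mu_bar).T @ (mus - mu_bar) / len(mus)
within = sigmas.mean(axis=0)
total = between + within

# Principal components: eigenvectors of the total covariance
# (np.linalg.eigh returns eigenvalues in ascending order).
eigvals, eigvecs = np.linalg.eigh(total)
pc1 = eigvecs[:, -1]   # leading component: here the x-axis direction
```

With the means spread along the x-axis and small isotropic covariances, the leading component aligns (up to sign) with the x-axis.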

Viewing Knowledge Transfer in Multilingual Machine Translation Through a Representational Lens

no code implementations • 19 May 2023 • David Stap, Vlad Niculae, Christof Monz

We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation.

Machine Translation • Transfer Learning • +1

DAG Learning on the Permutahedron

1 code implementation • 27 Jan 2023 • Valentina Zantedeschi, Luca Franceschi, Jean Kaddour, Matt J. Kusner, Vlad Niculae

We propose a continuous optimization framework for discovering a latent directed acyclic graph (DAG) from observational data.

Discrete Latent Structure in Neural Networks

no code implementations • 18 Jan 2023 • Vlad Niculae, Caio F. Corro, Nikita Nangia, Tsvetomila Mihaylova, André F. T. Martins

Many types of data from fields including natural language processing, computer vision, and bioinformatics are well represented by discrete, compositional structures such as trees, sequences, or matchings.

How Effective is Byte Pair Encoding for Out-Of-Vocabulary Words in Neural Machine Translation?

no code implementations • AMTA 2022 • Ali Araabi, Christof Monz, Vlad Niculae

While it is often assumed that by using BPE, NMT systems are capable of handling OOV words, the effectiveness of BPE in translating OOV words has not been explicitly measured.

Machine Translation • NMT • +1
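For readers unfamiliar with how BPE lets NMT systems represent OOV words at all, here is a toy Sennrich-style sketch (not the paper's experimental setup): merges are learned from an in-vocabulary corpus, then applied to segment an unseen word into known subwords:

```python
import collections

def learn_bpe(words, num_merges):
    """Learn BPE merges from a toy corpus (Sennrich-style sketch)."""
    # Represent each word as a tuple of symbols, starting from characters.
    vocab = collections.Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = collections.Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word with the new merged symbol.
        new_vocab = collections.Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i < len(word) - 1 and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1]); i += 2
                else:
                    out.append(word[i]); i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

def segment(word, merges):
    """Apply learned merges, in order, to a (possibly OOV) word."""
    symbols = list(word)
    for a, b in merges:
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and symbols[i] == a and symbols[i + 1] == b:
                out.append(a + b); i += 2
            else:
                out.append(symbols[i]); i += 1
        symbols = out
    return symbols

merges = learn_bpe(["low", "lower", "lowest"], num_merges=2)
print(segment("lowly", merges))  # -> ['low', 'l', 'y']
```

"lowly" was never seen, yet it is segmented into the learned subword "low" plus single characters; the paper's question is whether such segmentations actually translate well.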

Modeling Structure with Undirected Neural Networks

1 code implementation • 8 Feb 2022 • Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins

In this paper, we combine the representational strengths of factor graphs and of neural networks, proposing undirected neural networks (UNNs): a flexible framework for specifying computations that can be performed in any order.

Dependency Parsing • Image Classification

Sparse Communication via Mixed Distributions

1 code implementation • ICLR 2022 • António Farinhas, Wilker Aziz, Vlad Niculae, André F. T. Martins

Neural networks and other machine learning models compute continuous representations, while humans communicate mostly through discrete symbols.

Sparse Continuous Distributions and Fenchel-Young Losses

1 code implementation • 4 Aug 2021 • André F. T. Martins, Marcos Treviso, António Farinhas, Pedro M. Q. Aguiar, Mário A. T. Figueiredo, Mathieu Blondel, Vlad Niculae

In contrast, for finite domains, recent work on sparse alternatives to softmax (e.g., sparsemax, $\alpha$-entmax, and fusedmax) has led to distributions with varying support.

Audio Classification • Question Answering • +1
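Of the sparse softmax alternatives named above, sparsemax (Martins & Astudillo, 2016) is the simplest: the Euclidean projection of the scores onto the probability simplex. A short sketch of the standard sorting-based algorithm:

```python
import numpy as np

def sparsemax(z):
    """Sparsemax: Euclidean projection of scores z onto the probability
    simplex; unlike softmax, it can assign exactly zero probability."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    cssv = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    # Support size: largest k with 1 + k * z_(k) > cumsum of top-k scores.
    support = k[1 + k * z_sorted > cssv]
    k_max = support[-1]
    tau = (cssv[k_max - 1] - 1) / k_max      # threshold
    return np.maximum(z - tau, 0.0)

p = sparsemax([1.5, 1.0, -1.0])
print(p)  # -> [0.75, 0.25, 0.0]: the lowest score gets exactly zero mass
```

Softmax would spread nonzero probability over all three entries; sparsemax truncates the support, which is the property the paper extends to continuous domains.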

Learning Binary Decision Trees by Argmin Differentiation

1 code implementation • 9 Oct 2020 • Valentina Zantedeschi, Matt J. Kusner, Vlad Niculae

We address the problem of learning binary decision trees that partition data for some downstream task.


Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning

1 code implementation • EMNLP 2020 • Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins

Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data.

Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity

1 code implementation • NeurIPS 2020 • Gonçalo M. Correia, Vlad Niculae, Wilker Aziz, André F. T. Martins

In this paper, we propose a new training strategy which replaces these estimators by an exact yet efficient marginalization.

Sparse and Continuous Attention Mechanisms

2 code implementations • NeurIPS 2020 • André F. T. Martins, António Farinhas, Marcos Treviso, Vlad Niculae, Pedro M. Q. Aguiar, Mário A. T. Figueiredo

Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation).

Machine Translation • Question Answering • +4

Sparse and Structured Visual Attention

1 code implementation • 13 Feb 2020 • Pedro Henrique Martins, Vlad Niculae, Zita Marinho, André Martins

Visual attention mechanisms are widely used in multimodal tasks, such as visual question answering (VQA).

Image Captioning • Question Answering • +1

LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction

1 code implementation • ICML 2020 • Vlad Niculae, André F. T. Martins

Structured prediction requires manipulating a large number of combinatorial structures, e.g., dependency trees or alignments, either as latent or output variables.

Structured Prediction

Adaptively Sparse Transformers

3 code implementations • IJCNLP 2019 • Gonçalo M. Correia, Vlad Niculae, André F. T. Martins

Our quantitative and qualitative analysis finds that heads in different layers learn different sparsity preferences, and tend to be more diverse in their attention distributions than in softmax Transformers.

Machine Translation • Translation

Notes on Latent Structure Models and SPIGOT

no code implementations • 24 Jul 2019 • André F. T. Martins, Vlad Niculae

These notes aim to shed light on the recently proposed structured projected intermediate gradient optimization technique (SPIGOT, Peng et al., 2018).

Latent Structure Models for Natural Language Processing

no code implementations • ACL 2019 • André F. T. Martins, Tsvetomila Mihaylova, Nikita Nangia, Vlad Niculae

Latent structure models are a powerful tool for modeling compositional data, discovering linguistic structure, and building NLP pipelines.

Language Modelling • Machine Translation • +4

Learning with Fenchel-Young Losses

3 code implementations • 8 Jan 2019 • Mathieu Blondel, André F. T. Martins, Vlad Niculae

Over the past decades, numerous loss functions have been proposed for a variety of supervised learning tasks, including regression, classification, ranking, and more generally structured prediction.

Structured Prediction

Interpretable Structure Induction via Sparse Attention

no code implementations • WS 2018 • Ben Peters, Vlad Niculae, André F. T. Martins

Neural network methods are experiencing wide adoption in NLP, thanks to their empirical performance on many tasks.

Towards Dynamic Computation Graphs via Sparse Latent Structure

1 code implementation • EMNLP 2018 • Vlad Niculae, André F. T. Martins, Claire Cardie

Deep NLP models benefit from underlying structures in the data (e.g., parse trees), typically extracted using off-the-shelf parsers.

graph construction

Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms

2 code implementations • 24 May 2018 • Mathieu Blondel, André F. T. Martins, Vlad Niculae

This paper studies Fenchel-Young losses, a generic way to construct convex loss functions from a regularization function.
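The construction has a compact form: for a regularizer $\Omega$, the Fenchel-Young loss is $L_\Omega(\theta; y) = \Omega^*(\theta) + \Omega(y) - \langle\theta, y\rangle$, where $\Omega^*$ is the convex conjugate. A sketch (assuming $\Omega$ is the negative Shannon entropy restricted to the simplex, whose conjugate is logsumexp) showing that this recovers cross-entropy for one-hot targets:

```python
import numpy as np

def logsumexp(theta):
    # Numerically stable log-sum-exp: the conjugate of negative entropy.
    m = theta.max()
    return m + np.log(np.exp(theta - m).sum())

def fy_loss_shannon(theta, y):
    """Fenchel-Young loss L(theta; y) = Omega*(theta) + Omega(y) - <theta, y>
    with Omega = negative Shannon entropy on the simplex."""
    omega_y = np.sum(y[y > 0] * np.log(y[y > 0]))   # negative entropy of y
    return logsumexp(theta) + omega_y - theta @ y

theta = np.array([2.0, 0.5, -1.0])
y = np.array([1.0, 0.0, 0.0])          # one-hot target
loss = fy_loss_shannon(theta, y)

# For one-hot y, Omega(y) = 0 and the FY loss is the usual cross-entropy:
ce = -np.log(np.exp(theta)[0] / np.exp(theta).sum())
print(np.isclose(loss, ce))  # -> True
```

Swapping $\Omega$ (e.g., for the squared 2-norm, giving sparsemax loss) yields other losses from the same template, which is the paper's point.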

Multi-output Polynomial Networks and Factorization Machines

no code implementations • NeurIPS 2017 • Mathieu Blondel, Vlad Niculae, Takuma Otsuka, Naonori Ueda

On recommendation system tasks, we show how to combine our algorithm with a reduction from ordinal regression to multi-output classification, and find that the resulting algorithm outperforms simple baselines in terms of ranking accuracy.

General Classification

A Regularized Framework for Sparse and Structured Neural Attention

3 code implementations • NeurIPS 2017 • Vlad Niculae, Mathieu Blondel

Modern neural networks are often augmented with an attention mechanism, which tells the network where to focus within the input.

Machine Translation • Natural Language Inference • +3

Argument Mining with Structured SVMs and RNNs

1 code implementation • ACL 2017 • Vlad Niculae, Joonsuk Park, Claire Cardie

We propose a novel factor graph model for argument mining, designed for settings in which the argumentative relations in a document do not necessarily form a tree structure.

Argument Mining • General Classification

Conversational Markers of Constructive Discussions

no code implementations • NAACL 2016 • Vlad Niculae, Cristian Danescu-Niculescu-Mizil

We exploit conversational patterns reflecting the flow of ideas and the balance between the participants, as well as their linguistic choices.

QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns

no code implementations • 6 Apr 2015 • Vlad Niculae, Caroline Suen, Justine Zhang, Cristian Danescu-Niculescu-Mizil, Jure Leskovec

By encoding bias patterns in a low-rank space we provide an analysis of the structure of political media coverage.

The Romanian Neuter Examined Through A Two-Gender N-Gram Classification System

no code implementations • LREC 2012 • Liviu P. Dinu, Vlad Niculae, Octavia-Maria Șulea

A recent analysis of the Romanian gender system (Bateman and Polinsky, 2010), based on older observations, argues that there are two lexically unspecified noun classes in the singular and two different ones in the plural, and that what is generally called neuter in Romanian shares its singular class with masculines and its plural class with feminines, based not only on agreement features but also on form.

General Classification
