Search Results for author: Vlad Niculae

Found 51 papers, 21 papers with code

On Target Representation in Continuous-output Neural Machine Translation

no code implementations • RepL4NLP (ACL) 2022 • Evgeniia Tokarchuk, Vlad Niculae

We explore pretrained embeddings and also introduce knowledge transfer from the discrete Transformer model using embeddings in Euclidean and non-Euclidean spaces.

Audio Generation • Machine Translation • +4

Sparse and Structured Hopfield Networks

1 code implementation • 21 Feb 2024 • Saul Santos, Vlad Niculae, Daniel McNamee, André F. T. Martins

Modern Hopfield networks have enjoyed recent interest due to their connection to attention in transformers.

Multiple Instance Learning • Retrieval

On Measuring Context Utilization in Document-Level MT Systems

1 code implementation • 2 Feb 2024 • Wafaa Mohammed, Vlad Niculae

We propose to complement accuracy-based evaluation with measures of context utilization.


The Unreasonable Effectiveness of Random Target Embeddings for Continuous-Output Neural Machine Translation

no code implementations • 31 Oct 2023 • Evgeniia Tokarchuk, Vlad Niculae

Continuous-output neural machine translation (CoNMT) replaces the discrete next-word prediction problem with an embedding prediction.

Machine Translation • Translation
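The abstract above only names the embedding-prediction idea; as a minimal sketch (not the authors' implementation, with hypothetical toy embeddings), CoNMT-style decoding can map a predicted continuous vector back to a word by nearest neighbor in embedding space:

```python
import numpy as np

# Toy vocabulary embeddings (rows); the values are illustrative only.
vocab = ["the", "cat", "sat"]
emb = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [0.7, 0.7]])

def decode_nearest(pred, emb, vocab):
    """Map a predicted continuous vector to the closest vocabulary word
    by Euclidean distance, as in continuous-output decoding."""
    dists = np.linalg.norm(emb - pred, axis=1)
    return vocab[int(np.argmin(dists))]

pred = np.array([0.1, 0.9])              # the model's continuous prediction
print(decode_nearest(pred, emb, vocab))  # -> cat
```

The loss during training is then a distance between the predicted vector and the target word's embedding, rather than a softmax over the vocabulary.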

Joint Dropout: Improving Generalizability in Low-Resource Neural Machine Translation through Phrase Pair Variables

no code implementations • 24 Jul 2023 • Ali Araabi, Vlad Niculae, Christof Monz

Despite the tremendous success of Neural Machine Translation (NMT), its performance on low-resource language pairs remains subpar, partly due to its limited ability to handle previously unseen inputs, i.e., to generalize.

Low-Resource Neural Machine Translation • NMT • +1

Two derivations of Principal Component Analysis on datasets of distributions

no code implementations • 23 Jun 2023 • Vlad Niculae

In this brief note, we formulate Principal Component Analysis (PCA) over datasets consisting not of points but of distributions, characterized by their location and covariance.
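The snippet above does not spell out the construction. One natural route (an assumption for illustration, not necessarily either of the note's two derivations) is the law of total covariance: treat the dataset as a mixture, so the total covariance is the scatter of the locations plus the average within-distribution covariance, and take principal components of that:

```python
import numpy as np

# Hypothetical dataset of Gaussians, each given by a mean and a covariance.
mus = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
sigmas = np.array([np.eye(2) * 0.1] * 3)

# Law of total covariance: mixture covariance = scatter of the means
# plus the average within-distribution covariance.
mu_bar = mus.mean(axis=0)
between = (mus - mu_bar).T @ (mus - mu_bar) / len(mus)
within = sigmas.mean(axis=0)
total = between + within

# Principal components: eigenvectors of the total covariance
# (np.linalg.eigh returns eigenvalues in ascending order).
eigvals, eigvecs = np.linalg.eigh(total)
pc1 = eigvecs[:, -1]   # leading component: here the x-axis direction
```

With the means spread along the x-axis and small isotropic covariances, the leading component aligns (up to sign) with the x-axis.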

Viewing Knowledge Transfer in Multilingual Machine Translation Through a Representational Lens

no code implementations • 19 May 2023 • David Stap, Vlad Niculae, Christof Monz

We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation.

Machine Translation • Transfer Learning • +1

DAG Learning on the Permutahedron

1 code implementation • 27 Jan 2023 • Valentina Zantedeschi, Luca Franceschi, Jean Kaddour, Matt J. Kusner, Vlad Niculae

We propose a continuous optimization framework for discovering a latent directed acyclic graph (DAG) from observational data.

Discrete Latent Structure in Neural Networks

no code implementations • 18 Jan 2023 • Vlad Niculae, Caio F. Corro, Nikita Nangia, Tsvetomila Mihaylova, André F. T. Martins

Many types of data from fields including natural language processing, computer vision, and bioinformatics are well represented by discrete, compositional structures such as trees, sequences, or matchings.

How Effective is Byte Pair Encoding for Out-Of-Vocabulary Words in Neural Machine Translation?

no code implementations • AMTA 2022 • Ali Araabi, Christof Monz, Vlad Niculae

While it is often assumed that by using BPE, NMT systems are capable of handling OOV words, the effectiveness of BPE in translating OOV words has not been explicitly measured.

Machine Translation • NMT • +1
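For readers unfamiliar with how BPE lets NMT systems represent OOV words at all, here is a toy Sennrich-style sketch (not the paper's experimental setup): merges are learned from an in-vocabulary corpus, then applied to segment an unseen word into known subwords:

```python
import collections

def learn_bpe(words, num_merges):
    """Learn BPE merges from a toy corpus (Sennrich-style sketch)."""
    # Represent each word as a tuple of symbols, starting from characters.
    vocab = collections.Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = collections.Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word with the new merged symbol.
        new_vocab = collections.Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i < len(word) - 1 and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1]); i += 2
                else:
                    out.append(word[i]); i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

def segment(word, merges):
    """Apply learned merges, in order, to a (possibly OOV) word."""
    symbols = list(word)
    for a, b in merges:
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and symbols[i] == a and symbols[i + 1] == b:
                out.append(a + b); i += 2
            else:
                out.append(symbols[i]); i += 1
        symbols = out
    return symbols

merges = learn_bpe(["low", "lower", "lowest"], num_merges=2)
print(segment("lowly", merges))  # -> ['low', 'l', 'y']
```

"lowly" was never seen, yet it is segmented into the learned subword "low" plus single characters; the paper's question is whether such segmentations actually translate well.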

Modeling Structure with Undirected Neural Networks

1 code implementation • 8 Feb 2022 • Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins

In this paper, we combine the representational strengths of factor graphs and of neural networks, proposing undirected neural networks (UNNs): a flexible framework for specifying computations that can be performed in any order.

Dependency Parsing • Image Classification

Sparse Communication via Mixed Distributions

1 code implementation • ICLR 2022 • António Farinhas, Wilker Aziz, Vlad Niculae, André F. T. Martins

Neural networks and other machine learning models compute continuous representations, while humans communicate mostly through discrete symbols.

Sparse Continuous Distributions and Fenchel-Young Losses

1 code implementation • 4 Aug 2021 • André F. T. Martins, Marcos Treviso, António Farinhas, Pedro M. Q. Aguiar, Mário A. T. Figueiredo, Mathieu Blondel, Vlad Niculae

In contrast, for finite domains, recent work on sparse alternatives to softmax (e.g., sparsemax, $\alpha$-entmax, and fusedmax) has led to distributions with varying support.

Audio Classification • Question Answering • +1
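Of the sparse softmax alternatives named above, sparsemax (Martins & Astudillo, 2016) is the simplest: the Euclidean projection of the scores onto the probability simplex. A short sketch of the standard sorting-based algorithm:

```python
import numpy as np

def sparsemax(z):
    """Sparsemax: Euclidean projection of scores z onto the probability
    simplex; unlike softmax, it can assign exactly zero probability."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    cssv = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    # Support size: largest k with 1 + k * z_(k) > cumsum of top-k scores.
    support = k[1 + k * z_sorted > cssv]
    k_max = support[-1]
    tau = (cssv[k_max - 1] - 1) / k_max      # threshold
    return np.maximum(z - tau, 0.0)

p = sparsemax([1.5, 1.0, -1.0])
print(p)  # -> [0.75, 0.25, 0.0]: the lowest score gets exactly zero mass
```

Softmax would spread nonzero probability over all three entries; sparsemax truncates the support, which is the property the paper extends to continuous domains.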

Learning Binary Decision Trees by Argmin Differentiation

1 code implementation • 9 Oct 2020 • Valentina Zantedeschi, Matt J. Kusner, Vlad Niculae

We address the problem of learning binary decision trees that partition data for some downstream task.


Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning

1 code implementation • EMNLP 2020 • Tsvetomila Mihaylova, Vlad Niculae, André F. T. Martins

Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data.

Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity

1 code implementation • NeurIPS 2020 • Gonçalo M. Correia, Vlad Niculae, Wilker Aziz, André F. T. Martins

In this paper, we propose a new training strategy which replaces these estimators by an exact yet efficient marginalization.

Sparse and Continuous Attention Mechanisms

2 code implementations • NeurIPS 2020 • André F. T. Martins, António Farinhas, Marcos Treviso, Vlad Niculae, Pedro M. Q. Aguiar, Mário A. T. Figueiredo

Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation).

Machine Translation • Question Answering • +4

Sparse and Structured Visual Attention

1 code implementation • 13 Feb 2020 • Pedro Henrique Martins, Vlad Niculae, Zita Marinho, André Martins

Visual attention mechanisms are widely used in multimodal tasks, such as visual question answering (VQA).

Image Captioning • Question Answering • +1

LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction

1 code implementation • ICML 2020 • Vlad Niculae, André F. T. Martins

Structured prediction requires manipulating a large number of combinatorial structures, e.g., dependency trees or alignments, either as latent or output variables.

Structured Prediction

Adaptively Sparse Transformers

3 code implementations • IJCNLP 2019 • Gonçalo M. Correia, Vlad Niculae, André F. T. Martins

Our quantitative and qualitative analysis finds that heads in different layers learn different sparsity preferences, and tend to be more diverse in their attention distributions than in softmax Transformers.

Machine Translation • Translation

Notes on Latent Structure Models and SPIGOT

no code implementations • 24 Jul 2019 • André F. T. Martins, Vlad Niculae

These notes aim to shed light on the recently proposed structured projected intermediate gradient optimization technique (SPIGOT, Peng et al., 2018).

Latent Structure Models for Natural Language Processing

no code implementations • ACL 2019 • André F. T. Martins, Tsvetomila Mihaylova, Nikita Nangia, Vlad Niculae

Latent structure models are a powerful tool for modeling compositional data, discovering linguistic structure, and building NLP pipelines.

Language Modelling • Machine Translation • +4

Learning with Fenchel-Young Losses

3 code implementations • 8 Jan 2019 • Mathieu Blondel, André F. T. Martins, Vlad Niculae

Over the past decades, numerous loss functions have been proposed for a variety of supervised learning tasks, including regression, classification, ranking, and more generally structured prediction.

Structured Prediction

Interpretable Structure Induction via Sparse Attention

no code implementations • WS 2018 • Ben Peters, Vlad Niculae, André F. T. Martins

Neural network methods are experiencing wide adoption in NLP, thanks to their empirical performance on many tasks.

Towards Dynamic Computation Graphs via Sparse Latent Structure

1 code implementation • EMNLP 2018 • Vlad Niculae, André F. T. Martins, Claire Cardie

Deep NLP models benefit from underlying structures in the data (e.g., parse trees), typically extracted using off-the-shelf parsers.

graph construction

Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms

2 code implementations • 24 May 2018 • Mathieu Blondel, André F. T. Martins, Vlad Niculae

This paper studies Fenchel-Young losses, a generic way to construct convex loss functions from a regularization function.
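The construction has a compact form: for a regularizer $\Omega$, the Fenchel-Young loss is $L_\Omega(\theta; y) = \Omega^*(\theta) + \Omega(y) - \langle\theta, y\rangle$, where $\Omega^*$ is the convex conjugate. A sketch (assuming $\Omega$ is the negative Shannon entropy restricted to the simplex, whose conjugate is logsumexp) showing that this recovers cross-entropy for one-hot targets:

```python
import numpy as np

def logsumexp(theta):
    # Numerically stable log-sum-exp: the conjugate of negative entropy.
    m = theta.max()
    return m + np.log(np.exp(theta - m).sum())

def fy_loss_shannon(theta, y):
    """Fenchel-Young loss L(theta; y) = Omega*(theta) + Omega(y) - <theta, y>
    with Omega = negative Shannon entropy on the simplex."""
    omega_y = np.sum(y[y > 0] * np.log(y[y > 0]))   # negative entropy of y
    return logsumexp(theta) + omega_y - theta @ y

theta = np.array([2.0, 0.5, -1.0])
y = np.array([1.0, 0.0, 0.0])          # one-hot target
loss = fy_loss_shannon(theta, y)

# For one-hot y, Omega(y) = 0 and the FY loss is the usual cross-entropy:
ce = -np.log(np.exp(theta)[0] / np.exp(theta).sum())
print(np.isclose(loss, ce))  # -> True
```

Swapping $\Omega$ (e.g., for the squared 2-norm, giving sparsemax loss) yields other losses from the same template, which is the paper's point.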

Multi-output Polynomial Networks and Factorization Machines

no code implementations • NeurIPS 2017 • Mathieu Blondel, Vlad Niculae, Takuma Otsuka, Naonori Ueda

On recommendation system tasks, we show how to combine our algorithm with a reduction from ordinal regression to multi-output classification, and find that the resulting algorithm outperforms simple baselines in terms of ranking accuracy.

General Classification

A Regularized Framework for Sparse and Structured Neural Attention

3 code implementations • NeurIPS 2017 • Vlad Niculae, Mathieu Blondel

Modern neural networks are often augmented with an attention mechanism, which tells the network where to focus within the input.

Machine Translation • Natural Language Inference • +3

Argument Mining with Structured SVMs and RNNs

1 code implementation • ACL 2017 • Vlad Niculae, Joonsuk Park, Claire Cardie

We propose a novel factor graph model for argument mining, designed for settings in which the argumentative relations in a document do not necessarily form a tree structure.

Argument Mining • General Classification

Conversational Markers of Constructive Discussions

no code implementations • NAACL 2016 • Vlad Niculae, Cristian Danescu-Niculescu-Mizil

We exploit conversational patterns reflecting the flow of ideas and the balance between the participants, as well as their linguistic choices.

QUOTUS: The Structure of Political Media Coverage as Revealed by Quoting Patterns

no code implementations • 6 Apr 2015 • Vlad Niculae, Caroline Suen, Justine Zhang, Cristian Danescu-Niculescu-Mizil, Jure Leskovec

By encoding bias patterns in a low-rank space we provide an analysis of the structure of political media coverage.

The Romanian Neuter Examined Through A Two-Gender N-Gram Classification System

no code implementations • LREC 2012 • Liviu P. Dinu, Vlad Niculae, Octavia-Maria Șulea

A recent analysis of the Romanian gender system (Bateman and Polinsky, 2010), based on older observations, argues that there are two lexically unspecified noun classes in the singular and two different ones in the plural, and that what is generally called neuter in Romanian shares its singular class with masculines and its plural class with feminines, based not only on agreement features but also on form.

General Classification
