Search Results for author: Long Mai

Found 27 papers, 12 papers with code

SPICED: News Similarity Detection Dataset with Multiple Topics and Complexity Levels

no code implementations • 21 Sep 2023 • Elena Shushkevich, Long Mai, Manuel V. Loureiro, Steven Derby, Tri Kurniawan Wijaya

Nowadays, the use of intelligent systems to detect redundant information in news articles has become especially prevalent with the proliferation of news media outlets in order to enhance user experience.


MagicProp: Diffusion-based Video Editing via Motion-aware Appearance Propagation

no code implementations • 2 Sep 2023 • Hanshu Yan, Jun Hao Liew, Long Mai, Shanchuan Lin, Jiashi Feng

The flexibility of these techniques enables the editing of arbitrary regions within the frame.

Video Editing


Enhancing conversational quality in language learning chatbots: An evaluation of GPT4 for ASR error correction

no code implementations • 19 Jul 2023 • Long Mai, Julie Carson-Berndsen

The integration of natural language processing (NLP) technologies into educational applications has shown promising results, particularly in the language learning domain.

Semantic Textual Similarity, STS


Unsupervised domain adaptation for speech recognition with unsupervised error correction

no code implementations • 24 Sep 2022 • Long Mai, Julie Carson-Berndsen

The transcription quality of automatic speech recognition (ASR) systems degrades significantly when transcribing audios coming from unseen domains.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2


Motion-Adjustable Neural Implicit Video Representation

no code implementations • CVPR 2022 • Long Mai, Feng Liu

The model is trained end-to-end on a video to jointly determine the phase-shift values at each time with the mapping from the phase-shifted sinusoidal functions to the corresponding frame, enabling an implicit video representation.

Motion Magnification


Double Trouble: How to not explain a text classifier's decisions using counterfactuals synthesized by masked language models?

1 code implementation • 22 Oct 2021 • Thang M. Pham, Trung Bui, Long Mai, Anh Nguyen

We find two reasons why IM is not better than LOO: (1) deleting a single word from the input only marginally reduces a classifier's accuracy; and (2) a highly predictable word is always given near-zero attribution, regardless of its true importance to the classifier.

Causal Inference


Compositional Sketch Search

1 code implementation • 15 Jun 2021 • Alexander Black, Tu Bui, Long Mai, Hailin Jin, John Collomosse

We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects.

Position Quantization +2


APES: Audiovisual Person Search in Untrimmed Video

1 code implementation • 3 Jun 2021 • Juan Leon Alcazar, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem, Fabian Caba Heilbron

To showcase the potential of our new dataset, we propose an audiovisual baseline and benchmark for person retrieval.

Person Retrieval, Person Search, +3


Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

1 code implementation • CVPR 2021 • S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yağız Aksoy

Neural networks have shown great abilities in estimating depth from a single image.

Ranked #1 on Monocular Depth Estimation on Middlebury 2014

Monocular Depth Estimation

1,443


Out of Order: How Important Is The Sequential Order of Words in a Sentence in Natural Language Understanding Tasks?

no code implementations • Findings (ACL) 2021 • Thang M. Pham, Trung Bui, Long Mai, Anh Nguyen

Encouraging classifiers to capture word order information improves the performance on most GLUE tasks, SQuAD 2.0 and out-of-samples.

Natural Language Inference, Natural Language Understanding, +2


Learning to Recover 3D Scene Shape from a Single Image

1 code implementation • CVPR 2021 • Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen

Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due to an unknown depth shift induced by shift-invariant reconstruction losses used in mixed-data depth prediction training, and possible unknown camera focal length.

Ranked #1 on Indoor Monocular Depth Estimation on DIODE (using extra training data)

3D Scene Reconstruction, Depth Prediction, +3

1,027


Revisiting Adaptive Convolutions for Video Frame Interpolation

no code implementations • 2 Nov 2020 • Simon Niklaus, Long Mai, Oliver Wang

Video frame interpolation, the synthesis of novel views in time, is an increasingly popular research direction with many new papers further advancing the state of the art.

Image Denoising, Video Frame Interpolation, +1


Active Speakers in Context

1 code implementation • CVPR 2020 • Juan Leon Alcazar, Fabian Caba Heilbron, Long Mai, Federico Perazzi, Joon-Young Lee, Pablo Arbelaez, Bernard Ghanem

Current methods for active speaker detection focus on modeling short-term audiovisual information from a single speaker.

Ranked #15 on Audio-Visual Active Speaker Detection on AVA-ActiveSpeaker

Audio-Visual Active Speaker Detection


Context-Aware Group Captioning via Self-Attention and Contrastive Features

no code implementations • CVPR 2020 • Zhuowan Li, Quan Tran, Long Mai, Zhe Lin, Alan Yuille

In this paper, we introduce a new task, context-aware group captioning, which aims to describe a group of target images in the context of another group of related reference images.

Image Captioning


BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images

1 code implementation • NeurIPS 2020 • Thu Nguyen-Phuoc, Christian Richardt, Long Mai, Yong-Liang Yang, Niloy Mitra

Our experiments show that using explicit 3D features to represent objects allows BlockGAN to learn disentangled representations both in terms of objects (foreground and background) and their properties (pose and identity).

Object Representation Learning


A cost-effective method for improving and re-purposing large, pre-trained GANs by fine-tuning their class-embeddings

no code implementations • 10 Oct 2019 • Qi Li, Long Mai, Michael A. Alcorn, Anh Nguyen

Large, pre-trained generative models have been increasingly popular and useful to both the research and wider communities.

Model Editing


An Internal Learning Approach to Video Inpainting

1 code implementation • ICCV 2019 • Haotian Zhang, Long Mai, Ning Xu, Zhaowen Wang, John Collomosse, Hailin Jin

We propose a novel video inpainting algorithm that simultaneously hallucinates missing appearance and motion (optical flow) information, building upon the recent 'Deep Image Prior' (DIP) that exploits convolutional network architectures to enforce plausible texture in static images.

Optical Flow Estimation, Video Inpainting


3D Ken Burns Effect from a Single Image

4 code implementations • 12 Sep 2019 • Simon Niklaus, Long Mai, Jimei Yang, Feng Liu

According to this depth estimate, our framework then maps the input image to a point cloud and synthesizes the resulting video frames by rendering the point cloud from the corresponding camera positions.

Ranked #4 on Depth Estimation on NYU-Depth V2

Depth Estimation, Depth Prediction

1,495


M2KD: Multi-model and Multi-level Knowledge Distillation for Incremental Learning

no code implementations • 3 Apr 2019 • Peng Zhou, Long Mai, Jianming Zhang, Ning Xu, Zuxuan Wu, Larry S. Davis

Instead of sequentially distilling knowledge only from the last model, we directly leverage all previous model snapshots.

Incremental Learning, Knowledge Distillation


Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects

1 code implementation • CVPR 2019 • Michael A. Alcorn, Qi Li, Zhitao Gong, Chengfei Wang, Long Mai, Wei-Shinn Ku, Anh Nguyen

Using our framework and a self-assembled dataset of 3D objects, we investigate the vulnerability of DNNs to OoD poses of well-known objects in ImageNet.


Interactive Boundary Prediction for Object Selection

no code implementations • ECCV 2018 • Hoang Le, Long Mai, Brian Price, Scott Cohen, Hailin Jin, Feng Liu

Instead of relying on pre-defined low-level image features, our method adaptively predicts object boundaries according to image content and user interactions.

Image Segmentation, Interactive Segmentation, +3


Video Frame Interpolation via Adaptive Separable Convolution

6 code implementations • ICCV 2017 • Simon Niklaus, Long Mai, Feng Liu

Our method develops a deep fully convolutional neural network that takes two input frames and estimates pairs of 1D kernels for all pixels simultaneously.

Ranked #8 on Video Frame Interpolation on Middlebury

Optical Flow Estimation, Video Frame Interpolation

1,009


Spatial-Semantic Image Search by Visual Feature Synthesis

no code implementations • CVPR 2017 • Long Mai, Hailin Jin, Zhe Lin, Chen Fang, Jonathan Brandt, Feng Liu

We train a convolutional neural network to synthesize appropriate visual features that capture the spatial-semantic constraints from the user canvas query.

Image Retrieval, Retrieval


Video Frame Interpolation via Adaptive Convolution

1 code implementation • CVPR 2017 • Simon Niklaus, Long Mai, Feng Liu

Video frame interpolation typically involves two steps: motion estimation and pixel synthesis.

Motion Estimation, Optical Flow Estimation, +1


Composition-Preserving Deep Photo Aesthetics Assessment

no code implementations • CVPR 2016 • Long Mai, Hailin Jin, Feng Liu

Deep convolutional neural network (ConvNet) methods have recently shown promising results for aesthetics assessment.

Ranked #6 on Aesthetics Quality Assessment on AVA

Aesthetics Quality Assessment


Kernel Fusion for Better Image Deblurring

no code implementations • CVPR 2015 • Long Mai, Feng Liu

This Gaussian Conditional Random Fields-based kernel fusion method not only models how individual kernels are fused at each kernel element but also the interaction of kernel fusion among multiple kernel elements.

Deblurring, Image Deblurring


Saliency Aggregation: A Data-Driven Approach

no code implementations • CVPR 2013 • Long Mai, Yuzhen Niu, Feng Liu

Our idea is to use data-driven approaches to saliency aggregation that appropriately consider the performance gaps among individual methods and the performance dependence of each method on individual images.

