Search Results for author: Ser-Nam Lim

Found 95 papers, 46 papers with code

A Metric Learning Reality Check

4 code implementations ECCV 2020 Kevin Musgrave, Serge Belongie, Ser-Nam Lim

Deep metric learning papers from the past four years have consistently claimed great advances in accuracy, often more than doubling the performance of decade-old methods.

Metric Learning

PyTorch Metric Learning

1 code implementation 20 Aug 2020 Kevin Musgrave, Serge Belongie, Ser-Nam Lim

Deep metric learning algorithms have a wide variety of applications, but implementing these algorithms can be tedious and time consuming.

Metric Learning

PyTorch Adapt

2 code implementations 28 Nov 2022 Kevin Musgrave, Serge Belongie, Ser-Nam Lim

PyTorch Adapt is a library for domain adaptation, a type of machine learning algorithm that re-purposes existing models to work in new domains.

Domain Adaptation

HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions

7 code implementations 28 Jul 2022 Yongming Rao, Wenliang Zhao, Yansong Tang, Jie Zhou, Ser-Nam Lim, Jiwen Lu

In this paper, we show that the key ingredients behind the vision Transformers, namely input-adaptive, long-range and high-order spatial interactions, can also be efficiently implemented with a convolution-based framework.

Image Classification Object Detection +2

Visual Prompt Tuning

6 code implementations 23 Mar 2022 Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge Belongie, Bharath Hariharan, Ser-Nam Lim

The current modus operandi in adapting pre-trained models involves updating all the backbone parameters, i.e., full fine-tuning.

Image Classification Long-tail Learning +2

Three New Validators and a Large-Scale Benchmark Ranking for Unsupervised Domain Adaptation

1 code implementation 15 Aug 2022 Kevin Musgrave, Serge Belongie, Ser-Nam Lim

In a supervised setting, these validators evaluate checkpoints by computing accuracy on a validation set that has labels.

Unsupervised Domain Adaptation

NeRV: Neural Representations for Videos

3 code implementations NeurIPS 2021 Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava

In contrast, with NeRV, we can use any neural network compression method as a proxy for video compression, and achieve comparable performance to traditional frame-based video compression approaches (H.264, HEVC, etc.).

Denoising Neural Network Compression +3

Edge Proposal Sets for Link Prediction

2 code implementations 30 Jun 2021 Abhay Singh, Qian Huang, Sijia Linda Huang, Omkar Bhalerao, Horace He, Ser-Nam Lim, Austin R. Benson

Here, we demonstrate how simply adding a set of edges, which we call a proposal set, to the graph as a pre-processing step can improve the performance of several link prediction algorithms.

Experimental Design Link Prediction +1

Unconstrained Facial Expression Transfer using Style-based Generator

1 code implementation 12 Dec 2019 Chao Yang, Ser-Nam Lim

Given two face images, our method can create plausible results that combine the appearance of one image and the expression of the other.

Image Manipulation

RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out Distribution Robustness

2 code implementations 29 Jun 2022 Francesco Pinto, Harry Yang, Ser-Nam Lim, Philip H. S. Torr, Puneet K. Dokania

We show that the effectiveness of the well celebrated Mixup [Zhang et al., 2018] can be further improved if instead of using it as the sole learning objective, it is utilized as an additional regularizer to the standard cross-entropy loss.

Out-of-Distribution Detection
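
The idea in the snippet above, using Mixup as an extra regularizer on top of the standard cross-entropy loss, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the weight `eta`, the Beta(10, 10) draw for the mixing coefficient, and the two-class `toy_model` are all assumptions made for the example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Cross-entropy between predicted probabilities and a (possibly soft) target."""
    return -sum(t * math.log(max(p, 1e-12)) for p, t in zip(probs, target))


def mix(a, b, lam):
    """Convex combination lam * a + (1 - lam) * b, element-wise."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def regularized_mixup_loss(model, x1, y1, x2, y2, lam, eta=1.0):
    """Clean cross-entropy plus eta times the cross-entropy on a mixed sample.

    `eta` is a hypothetical weighting hyperparameter for this sketch.
    """
    clean = cross_entropy(softmax(model(x1)), y1)
    mixed = cross_entropy(softmax(model(mix(x1, x2, lam))), mix(y1, y2, lam))
    return clean + eta * mixed


def toy_model(x):
    """Hypothetical two-class linear scorer, used only to make the sketch runnable."""
    return [x[0] - x[1], x[1] - x[0]]


x1, y1 = [1.0, 0.0], [1.0, 0.0]   # sample of class 0 with a one-hot label
x2, y2 = [0.0, 1.0], [0.0, 1.0]   # sample of class 1 with a one-hot label
lam = random.betavariate(10.0, 10.0)  # assumed choice of mixing distribution
loss = regularized_mixup_loss(toy_model, x1, y1, x2, y2, lam)
```

Setting `eta=0.0` recovers plain cross-entropy training on the clean sample, which makes the regularizer's contribution easy to isolate.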

On Feature Normalization and Data Augmentation

1 code implementation CVPR 2021 Boyi Li, Felix Wu, Ser-Nam Lim, Serge Belongie, Kilian Q. Weinberger

The moments (a.k.a. mean and standard deviation) of latent features are often removed as noise when training image recognition models, to increase stability and reduce training time.

Data Augmentation Domain Generalization +2

Enhancing Adversarial Example Transferability with an Intermediate Level Attack

2 code implementations ICCV 2019 Qian Huang, Isay Katsman, Horace He, Zeqi Gu, Serge Belongie, Ser-Nam Lim

We show that we can select a layer of the source model to perturb without any knowledge of the target models while achieving high transferability.

What makes fake images detectable? Understanding properties that generalize

1 code implementation ECCV 2020 Lucy Chai, David Bau, Ser-Nam Lim, Phillip Isola

The quality of image generation and manipulation is reaching impressive levels, making it increasingly difficult for a human to distinguish between what is real and what is fake.

Image Generation

New Benchmarks for Learning on Non-Homophilous Graphs

1 code implementation 3 Apr 2021 Derek Lim, Xiuyu Li, Felix Hohne, Ser-Nam Lim

Much data with graph structures satisfy the principle of homophily, meaning that connected nodes tend to be similar with respect to a specific attribute.

Attribute Fraud Detection +3

Neural Manifold Ordinary Differential Equations

3 code implementations NeurIPS 2020 Aaron Lou, Derek Lim, Isay Katsman, Leo Huang, Qingxuan Jiang, Ser-Nam Lim, Christopher De Sa

To better conform to data geometry, recent deep generative modelling techniques adapt Euclidean constructions to non-Euclidean spaces.

Density Estimation

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

1 code implementation 8 Apr 2024 Bo He, Hengduo Li, Young Kyun Jang, Menglin Jia, Xuefei Cao, Ashish Shah, Abhinav Shrivastava, Ser-Nam Lim

However, existing LLM-based large multimodal models (e.g., Video-LLaMA, VideoChat) can only take in a limited number of frames for short video understanding.

Question Answering Video Captioning +4

M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection

1 code implementation 20 Apr 2021 Junke Wang, Zuxuan Wu, Wenhao Ouyang, Xintong Han, Jingjing Chen, Ser-Nam Lim, Yu-Gang Jiang

The widespread dissemination of Deepfakes demands effective approaches that can detect perceptually convincing forged images.

DeepFake Detection Face Swapping +1

Graph Inductive Biases in Transformers without Message Passing

1 code implementation 27 May 2023 Liheng Ma, Chen Lin, Derek Lim, Adriana Romero-Soriano, Puneet K. Dokania, Mark Coates, Philip Torr, Ser-Nam Lim

Graph inductive biases are crucial for Graph Transformers, and previous works incorporate them using message-passing modules and/or positional encodings.

Graph Classification Graph Regression +2

Differentiating through the Fréchet Mean

2 code implementations ICML 2020 Aaron Lou, Isay Katsman, Qingxuan Jiang, Serge Belongie, Ser-Nam Lim, Christopher De Sa

Recent advances in deep representation learning on Riemannian manifolds extend classical deep learning operations to better capture the geometry of the manifold.

Representation Learning

Rethinking Nearest Neighbors for Visual Classification

1 code implementation 15 Dec 2021 Menglin Jia, Bor-Chun Chen, Zuxuan Wu, Claire Cardie, Serge Belongie, Ser-Nam Lim

In this paper, we investigate k-Nearest-Neighbor (k-NN) classifiers, a classical model-free learning method from the pre-deep learning era, as an augmentation to modern neural network based approaches.

Classification

Intentonomy: a Dataset and Study towards Human Intent Understanding

1 code implementation CVPR 2021 Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie, Ser-Nam Lim

Based on our findings, we conduct further study to quantify the effect of attending to object and context classes as well as textual information in the form of hashtags when training an intent classifier.

Better Set Representations For Relational Reasoning

1 code implementation NeurIPS 2020 Qian Huang, Horace He, Abhay Singh, Yan Zhang, Ser-Nam Lim, Austin Benson

Incorporating relational reasoning into neural networks has greatly expanded their capabilities and scope.

Relational Reasoning

Towards Adversarial Evaluations for Inexact Machine Unlearning

3 code implementations 17 Jan 2022 Shashwat Goel, Ameya Prabhu, Amartya Sanyal, Ser-Nam Lim, Philip Torr, Ponnurangam Kumaraguru

Machine Learning models face increased concerns regarding the storage of personal user data and adverse impacts of corrupted data like backdoors or systematic bias.

Machine Unlearning Memorization

Spartan: Differentiable Sparsity via Regularized Transportation

1 code implementation 27 May 2022 Kai Sheng Tai, Taipeng Tian, Ser-Nam Lim

We present Spartan, a method for training sparse neural network models with a predetermined level of sparsity.

Network Pruning

Test-Time Distribution Normalization for Contrastively Learned Vision-language Models

2 code implementations 22 Feb 2023 Yifei Zhou, Juntao Ren, Fengyu Li, Ramin Zabih, Ser-Nam Lim

Advances in the field of vision-language contrastive learning have made it possible for many downstream applications to be carried out efficiently and accurately by simply taking the dot product between image and text representations.

Contrastive Learning

Computationally Budgeted Continual Learning: What Does Matter?

1 code implementation CVPR 2023 Ameya Prabhu, Hasan Abed Al Kader Hammoud, Puneet Dokania, Philip H. S. Torr, Ser-Nam Lim, Bernard Ghanem, Adel Bibi

Our conclusions are consistent in a different number of stream time steps, e.g., 20 to 200, and under several computational budgets.

Continual Learning

Deep Co-Training with Task Decomposition for Semi-Supervised Domain Adaptation

1 code implementation ICCV 2021 Luyu Yang, Yan Wang, Mingfei Gao, Abhinav Shrivastava, Kilian Q. Weinberger, Wei-Lun Chao, Ser-Nam Lim

To integrate the strengths of the two classifiers, we apply the well-established co-training framework, in which the two classifiers exchange their highly confident predictions to iteratively "teach each other" so that both classifiers can excel in the target domain.

Semi-supervised Domain Adaptation Unsupervised Domain Adaptation

Measuring Dataset Granularity

1 code implementation 21 Dec 2019 Yin Cui, Zeqi Gu, Dhruv Mahajan, Laurens van der Maaten, Serge Belongie, Ser-Nam Lim

We also investigate the interplay between dataset granularity with a variety of factors and find that fine-grained datasets are more difficult to learn from, more difficult to transfer to, more difficult to perform few-shot learning with, and more vulnerable to adversarial attacks.

Clustering Few-Shot Learning

Equivariant Manifold Flows

1 code implementation NeurIPS 2021 Isay Katsman, Aaron Lou, Derek Lim, Qingxuan Jiang, Ser-Nam Lim, Christopher De Sa

Tractably modelling distributions over manifolds has long been an important goal in the natural sciences.

TIPI: Test Time Adaptation With Transformation Invariance

1 code implementation CVPR 2023 A. Tuan Nguyen, Thanh Nguyen-Tang, Ser-Nam Lim, Philip H. S. Torr

Test Time Adaptation offers a means to combat this problem, as it allows the model to adapt during test time to the new data distribution, using only unlabeled test data batches.

Autonomous Driving Test-time Adaptation

Object-Centric Unsupervised Image Captioning

1 code implementation 2 Dec 2021 Zihang Meng, David Yang, Xuefei Cao, Ashish Shah, Ser-Nam Lim

Our work in this paper overcomes this by harvesting objects corresponding to a given sentence from the training set, even if they don't belong to the same image.

Image Captioning Object +1

Quantization Guided JPEG Artifact Correction

1 code implementation ECCV 2020 Max Ehrlich, Larry Davis, Ser-Nam Lim, Abhinav Shrivastava

The JPEG image compression algorithm is the most popular method of image compression because of its ability for large compression ratios.

JPEG Artifact Correction Quantization

Sample-dependent Adaptive Temperature Scaling for Improved Calibration

1 code implementation 13 Jul 2022 Tom Joy, Francesco Pinto, Ser-Nam Lim, Philip H. S. Torr, Puneet K. Dokania

The most common post-hoc approach to compensate for this is to perform temperature scaling, which adjusts the confidences of the predictions on any input by scaling the logits by a fixed value.

Out of Distribution (OOD) Detection
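
For readers unfamiliar with the baseline mentioned in the snippet above, standard temperature scaling can be sketched as dividing every logit by one fixed scalar before the softmax. This is a minimal sketch of that common post-hoc baseline, not of the sample-dependent method the paper proposes; the example logits and the temperature value 2.0 are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def temperature_scale(logits, temperature):
    """Return softmax probabilities after dividing all logits by one fixed temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return softmax([z / temperature for z in logits])


logits = [2.0, 1.0, 0.1]                      # illustrative logits for one input
probs_raw = temperature_scale(logits, 1.0)    # temperature 1 leaves the softmax unchanged
probs_soft = temperature_scale(logits, 2.0)   # temperature > 1 lowers the top confidence
```

Because the same scalar is applied to every logit, the predicted class does not change; only the confidence attached to it does. In practice the temperature is usually fitted on a held-out validation set.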

BT^2: Backward-compatible Training with Basis Transformation

1 code implementation 8 Nov 2022 Yifei Zhou, Zilu Li, Abhinav Shrivastava, Hengshuang Zhao, Antonio Torralba, Taipeng Tian, Ser-Nam Lim

In this way, the new representation can be directly compared with the old representation, in principle avoiding the need for any backfilling.

Retrieval

Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right?

1 code implementation ICCV 2023 Hasan Abed Al Kader Hammoud, Ameya Prabhu, Ser-Nam Lim, Philip H. S. Torr, Adel Bibi, Bernard Ghanem

We revisit the common practice of evaluating adaptation of Online Continual Learning (OCL) algorithms through the metric of online accuracy, which measures the accuracy of the model on the immediate next few samples.

Continual Learning

BT^2: Backward-compatible Training with Basis Transformation

1 code implementation ICCV 2023 Yifei Zhou, Zilu Li, Abhinav Shrivastava, Hengshuang Zhao, Antonio Torralba, Taipeng Tian, Ser-Nam Lim

In this way, the new representation can be directly compared with the old representation, in principle avoiding the need for any backfilling.

Language-free Compositional Action Generation via Decoupling Refinement

1 code implementation 7 Jul 2023 Xiao Liu, Guangyi Chen, Yansong Tang, Guangrun Wang, Xiao-Ping Zhang, Ser-Nam Lim

Composing simple elements into complex concepts is crucial yet challenging, especially for 3D action generation.

Action Generation

Adversarial Example Decomposition

no code implementations 4 Dec 2018 Horace He, Aaron Lou, Qingxuan Jiang, Isay Katsman, Serge Belongie, Ser-Nam Lim

Research has shown that widely used deep neural networks are vulnerable to carefully crafted adversarial perturbations.

Cross-X Learning for Fine-Grained Visual Categorization

no code implementations ICCV 2019 Wei Luo, Xitong Yang, Xianjie Mo, Yuheng Lu, Larry S. Davis, Jun Li, Jian Yang, Ser-Nam Lim

Recognizing objects from subcategories with very subtle differences remains a challenging task due to the large intra-class and small inter-class variation.

Ranked #18 on Fine-Grained Image Classification on NABirds (using extra training data)

Fine-Grained Image Classification Fine-Grained Visual Categorization

Unsupervised Deep Metric Learning via Auxiliary Rotation Loss

no code implementations 16 Nov 2019 Xuefei Cao, Bor-Chun Chen, Ser-Nam Lim

In this work, we propose to generate pseudo-labels for deep metric learning directly from clustering assignment and we introduce unsupervised deep metric learning (UDML) regularized by a self-supervision (SS) task.

Clustering Image Retrieval +3

Fine-grained Synthesis of Unrestricted Adversarial Examples

no code implementations 20 Nov 2019 Omid Poursaeed, Tianxing Jiang, Yordanos Goshu, Harry Yang, Serge Belongie, Ser-Nam Lim

We propose a novel approach for generating unrestricted adversarial examples by manipulating fine-grained aspects of image generation.

Image Generation object-detection +2

Deep Multi-Modal Sets

no code implementations 3 Mar 2020 Austin Reiter, Menglin Jia, Pu Yang, Ser-Nam Lim

Most deep learning-based methods rely on a late fusion technique whereby multiple feature types are encoded and concatenated and then a multi layer perceptron (MLP) combines the fused embedding to make predictions.

One-Shot Domain Adaptation For Face Generation

no code implementations CVPR 2020 Chao Yang, Ser-Nam Lim

To generate images of the same distribution, we introduce a style-mixing technique that transfers the low-level statistics from the target to faces randomly generated with the model.

Domain Adaptation Face Generation

Detecting Deep-Fake Videos from Appearance and Behavior

no code implementations 29 Apr 2020 Shruti Agarwal, Tarek El-Gaaly, Hany Farid, Ser-Nam Lim

Synthetically-generated audios and videos -- so-called deep fakes -- continue to capture the imagination of the computer-graphics and computer-vision communities.

Metric Learning

Curriculum Manager for Source Selection in Multi-Source Domain Adaptation

no code implementations ECCV 2020 Luyu Yang, Yogesh Balaji, Ser-Nam Lim, Abhinav Shrivastava

In this paper, we proposed an adversarial agent that learns a dynamic curriculum for source samples, called Curriculum Manager for Source Selection (CMSS).

Multi-Source Unsupervised Domain Adaptation Unsupervised Domain Adaptation

Analyzing and Mitigating JPEG Compression Defects in Deep Learning

no code implementations 17 Nov 2020 Max Ehrlich, Larry Davis, Ser-Nam Lim, Abhinav Shrivastava

We show that there is a significant penalty on common performance metrics for high compression.

GTA: Global Temporal Attention for Video Action Understanding

no code implementations 15 Dec 2020 Bo He, Xitong Yang, Zuxuan Wu, Hao Chen, Ser-Nam Lim, Abhinav Shrivastava

To this end, we introduce Global Temporal Attention (GTA), which performs global temporal attention on top of spatial attention in a decoupled manner.

Action Recognition Action Understanding +1

Deep Video Inpainting Detection

no code implementations 26 Jan 2021 Peng Zhou, Ning Yu, Zuxuan Wu, Larry S. Davis, Abhinav Shrivastava, Ser-Nam Lim

This paper studies video inpainting detection, which localizes an inpainted region in a video both spatially and temporally.

Video Inpainting

THAT: Two Head Adversarial Training for Improving Robustness at Scale

no code implementations 25 Mar 2021 Zuxuan Wu, Tom Goldstein, Larry S. Davis, Ser-Nam Lim

Many variants of adversarial training have been proposed, with most research focusing on problems with relatively few classes.

Vocal Bursts Valence Prediction

Multimodal Fusion Refiner Networks

no code implementations 8 Apr 2021 Sethuraman Sankaran, David Yang, Ser-Nam Lim

In this work, we develop a Refiner Fusion Network (ReFNet) that enables fusion modules to combine strong unimodal representation with strong multimodal representations.

Cross-Modal Retrieval Augmentation for Multi-Modal Classification

no code implementations Findings (EMNLP) 2021 Shir Gur, Natalia Neverova, Chris Stauffer, Ser-Nam Lim, Douwe Kiela, Austin Reiter

Recent advances in using retrieval components over external knowledge sources have shown impressive results for a variety of downstream tasks in natural language processing.

Cross-Modal Retrieval General Classification +4

Mix-MaxEnt: Creating High Entropy Barriers To Improve Accuracy and Uncertainty Estimates of Deterministic Neural Networks

no code implementations 29 Sep 2021 Francesco Pinto, Harry Yang, Ser-Nam Lim, Philip Torr, Puneet K. Dokania

We propose an extremely simple approach to regularize a single deterministic neural network to obtain improved accuracy and reliable uncertainty estimates.

Testing-Time Adaptation through Online Normalization Estimation

no code implementations 29 Sep 2021 Xuefeng Hu, Mustafa Uzunbas, Bor-Chun Chen, Rui Wang, Ashish Shah, Ram Nevatia, Ser-Nam Lim

We present a simple and effective way to estimate the batch-norm statistics during test time, to fast adapt a source model to target test samples.

Test-time Adaptation Unsupervised Domain Adaptation +1

Refining Multimodal Representations using a modality-centric self-supervised module

no code implementations 29 Sep 2021 Sethuraman Sankaran, David Yang, Ser-Nam Lim

Tasks that rely on multi-modal information typically include a fusion module that combines information from different modalities.

Few-Shot Learning

Differential Motion Evolution for Fine-Grained Motion Deformation in Unsupervised Image Animation

no code implementations 9 Oct 2021 Peirong Liu, Rui Wang, Xuefei Cao, Yipin Zhou, Ashish Shah, Ser-Nam Lim

Key findings are twofold: (1) by capturing the motion transfer with an ordinary differential equation (ODE), it helps to regularize the motion field, and (2) by utilizing the source image itself, we are able to inpaint occluded/missing regions arising from large motion changes.

Image Animation Motion Estimation

MixNorm: Test-Time Adaptation Through Online Normalization Estimation

no code implementations 21 Oct 2021 Xuefeng Hu, Gokhan Uzunbas, Sirius Chen, Rui Wang, Ashish Shah, Ram Nevatia, Ser-Nam Lim

We present a simple and effective way to estimate the batch-norm statistics during test time, to fast adapt a source model to target test samples.

Test-time Adaptation Unsupervised Domain Adaptation +1

A Frequency Perspective of Adversarial Robustness

no code implementations 26 Oct 2021 Shishira R Maiya, Max Ehrlich, Vatsal Agarwal, Ser-Nam Lim, Tom Goldstein, Abhinav Shrivastava

Our analysis shows that adversarial examples are neither in high-frequency nor in low-frequency components, but are simply dataset dependent.

Adversarial Robustness

AdaViT: Adaptive Vision Transformers for Efficient Image Recognition

no code implementations CVPR 2022 Lingchen Meng, Hengduo Li, Bor-Chun Chen, Shiyi Lan, Zuxuan Wu, Yu-Gang Jiang, Ser-Nam Lim

To this end, we introduce AdaViT, an adaptive computation framework that learns to derive usage policies on which patches, self-attention heads and transformer blocks to use throughout the backbone on a per-input basis, aiming to improve inference efficiency of vision transformers with a minimal drop of accuracy for image recognition.

Unsupervised Domain Adaptation: A Reality Check

no code implementations 30 Nov 2021 Kevin Musgrave, Serge Belongie, Ser-Nam Lim

Interest in unsupervised domain adaptation (UDA) has surged in recent years, resulting in a plethora of new algorithms.

Unsupervised Domain Adaptation

ObjectFormer for Image Manipulation Detection and Localization

no code implementations CVPR 2022 Junke Wang, Zuxuan Wu, Jingjing Chen, Xintong Han, Abhinav Shrivastava, Ser-Nam Lim, Yu-Gang Jiang

Recent advances in image editing techniques have posed serious challenges to the trustworthiness of multimedia data, which drives the research of image tampering detection.

Image Manipulation Image Manipulation Detection

VRAG: Region Attention Graphs for Content-Based Video Retrieval

no code implementations 18 May 2022 Kennard Ng, Ser-Nam Lim, Gim Hee Lee

In this paper, we introduce Video Region Attention Graph Networks (VRAG) that improve the state of the art of video-level methods.

Retrieval Video Retrieval

Raising the Bar on the Evaluation of Out-of-Distribution Detection

no code implementations 24 Sep 2022 Jishnu Mukhoti, Tsung-Yu Lin, Bor-Chun Chen, Ashish Shah, Philip H. S. Torr, Puneet K. Dokania, Ser-Nam Lim

In this paper, we define 2 categories of OoD data using the subtly different concepts of perceptual/visual and semantic similarity to in-distribution (iD) data.

Out-of-Distribution Detection Out of Distribution (OOD) Detection +2

Diversified Dynamic Routing for Vision Tasks

no code implementations 26 Sep 2022 Botos Csaba, Adel Bibi, Yanwei Li, Philip Torr, Ser-Nam Lim

Deep learning models for vision tasks are trained on large datasets under the assumption that there exists a universal representation that can be used to make predictions for all samples.

Instance Segmentation object-detection +2

Totems: Physical Objects for Verifying Visual Integrity

no code implementations 26 Sep 2022 Jingwei Ma, Lucy Chai, Minyoung Huh, Tongzhou Wang, Ser-Nam Lim, Phillip Isola, Antonio Torralba

We introduce a new approach to image forensics: placing physical refractive objects, which we call totems, into a scene so as to protect any photograph taken of that scene.

Image Forensics

CNeRV: Content-adaptive Neural Representation for Visual Data

no code implementations 18 Nov 2022 Hao Chen, Matt Gwilliam, Bo He, Ser-Nam Lim, Abhinav Shrivastava

We match the performance of NeRV, a state-of-the-art implicit neural representation, on the reconstruction task for frames seen during training, while far surpassing it for frames that are skipped during training (unseen images).

Data Compression

Unifying Tracking and Image-Video Object Detection

no code implementations 20 Nov 2022 Peirong Liu, Rui Wang, Pengchuan Zhang, Omid Poursaeed, Yipin Zhou, Xuefei Cao, Sreya Dutta Roy, Ashish Shah, Ser-Nam Lim

We propose TrIVD (Tracking and Image-Video Detection), the first framework that unifies image OD, video OD, and MOT within one end-to-end model.

Multi-Object Tracking Object +2

Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning

no code implementations CVPR 2023 Jishnu Mukhoti, Tsung-Yu Lin, Omid Poursaeed, Rui Wang, Ashish Shah, Philip H. S. Torr, Ser-Nam Lim

We introduce Patch Aligned Contrastive Learning (PACL), a modified compatibility function for CLIP's contrastive loss, intending to train an alignment between the patch tokens of the vision encoder and the CLS token of the text encoder.

Contrastive Learning Image Classification +5

Towards Scalable Neural Representation for Diverse Videos

no code implementations CVPR 2023 Bo He, Xitong Yang, Hanyu Wang, Zuxuan Wu, Hao Chen, Shuaiyi Huang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava

Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images, and have been recently applied to encode videos (e.g., NeRV, E-NeRV).

Action Recognition Video Compression

LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision

no code implementations 15 Apr 2023 Jiani Huang, Ziyang Li, Mayur Naik, Ser-Nam Lim

We propose LASER, a neuro-symbolic approach to learn semantic video representations that capture rich spatial and temporal properties in video data by leveraging high-level logic specifications.

Retrieval Video Captioning +2

Stable Estimation of Survival Causal Effects

no code implementations 1 Oct 2023 Khiem Pham, David A. Hirshberg, Phuong-Mai Huynh-Pham, Michele Santacatterina, Ser-Nam Lim, Ramin Zabih

Our experiments on synthetic and semi-synthetic data demonstrate that our method has competitive bias and smaller variance than debiased machine learning approaches.

From Categories to Classifier: Name-Only Continual Learning by Exploring the Web

no code implementations 19 Nov 2023 Ameya Prabhu, Hasan Abed Al Kader Hammoud, Ser-Nam Lim, Bernard Ghanem, Philip H. S. Torr, Adel Bibi

Continual Learning (CL) often relies on the availability of extensive annotated datasets, an assumption that is unrealistically time-consuming and costly in practice.

Continual Learning Image Classification +1

CLAMP: Contrastive LAnguage Model Prompt-tuning

no code implementations 4 Dec 2023 Piotr Teterwak, Ximeng Sun, Bryan A. Plummer, Kate Saenko, Ser-Nam Lim

Our results show that LLMs can, indeed, achieve good image classification performance when adapted this way.

Contrastive Learning Image Captioning +5

Label Delay in Continual Learning

no code implementations 1 Dec 2023 Botos Csaba, Wenxuan Zhang, Matthias Müller, Ser-Nam Lim, Mohamed Elhoseiny, Philip Torr, Adel Bibi

We introduce a new continual learning framework with explicit modeling of the label delay between data and label streams over time steps.

Continual Learning

Video Dynamics Prior: An Internal Learning Approach for Robust Video Enhancements

no code implementations NeurIPS 2023 Gaurav Shrivastava, Ser-Nam Lim, Abhinav Shrivastava

In this paper, we present a novel robust framework for low-level vision tasks, including denoising, object removal, frame interpolation, and super-resolution, that does not require any external training data corpus.

Denoising Super-Resolution

Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model

no code implementations 19 Dec 2023 Shraman Pramanick, Guangxing Han, Rui Hou, Sayan Nag, Ser-Nam Lim, Nicolas Ballas, Qifan Wang, Rama Chellappa, Amjad Almahairi

In this work, we introduce VistaLLM, a powerful visual system that addresses coarse- and fine-grained VL tasks over single and multiple input images using a unified framework.

Attribute Language Modelling +1

Universal Pyramid Adversarial Training for Improved ViT Performance

no code implementations 26 Dec 2023 Ping-Yeh Chiang, Yipin Zhou, Omid Poursaeed, Satya Narayan Shukla, Ashish Shah, Tom Goldstein, Ser-Nam Lim

Recently, Pyramid Adversarial training (Herrmann et al., 2022) has been shown to be very effective for improving clean accuracy and distribution-shift robustness of vision transformers.

Mitigating Dialogue Hallucination for Large Multi-modal Models via Adversarial Instruction Tuning

no code implementations 15 Mar 2024 Dongmin Park, Zhaofang Qian, Guangxing Han, Ser-Nam Lim

To precisely measure this, we first present an evaluation benchmark by extending popular multi-modal benchmark datasets with prepended hallucinatory dialogues generated by our novel Adversarial Question Generator, which can automatically generate image-related yet adversarial dialogues by adopting adversarial attacks on LMMs.

Hallucination Instruction Following +1
