Search Results for author: Steven Hoi

Found 46 papers, 20 papers with code

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

6 code implementations • 28 Jan 2022 • Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi

Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision.

Ranked #3 on Open Vocabulary Attribute Detection on OVAD-Box benchmark (using extra training data)

Image Captioning Image-text matching +5

124,793

Paper
Code

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

12 code implementations • 30 Jan 2023 • Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi

The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models.

Ranked #1 on Image Retrieval on Flickr30k

Generative Visual Question Answering Image Captioning +10

124,793

Paper
Code

RegNet: Self-Regulated Network for Image Classification

14 code implementations • 3 Jan 2021 • Jing Xu, Yu Pan, Xinglin Pan, Steven Hoi, Zhang Yi, Zenglin Xu

The ResNet and its variants have achieved remarkable successes in various computer vision tasks.

Ranked #3 on Medical Image Classification on NCT-CRC-HE-100K

General Classification Image Classification +1

60,884

Paper
Code

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

5 code implementations • NeurIPS 2021 • Junnan Li, Ramprasaath R. Selvaraju, Akhilesh Deepak Gotmare, Shafiq Joty, Caiming Xiong, Steven Hoi

Most existing methods employ a transformer-based multimodal encoder to jointly model visual tokens (region-based image features) and word tokens.

Ranked #5 on Open Vocabulary Attribute Detection on OVAD-Box benchmark (using extra training data)

Grounded language learning Image-text matching +8

8,713

Paper
Code

InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning

2 code implementations • NeurIPS 2023 • Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Junqi Zhao, Weisheng Wang, Boyang Li, Pascale Fung, Steven Hoi

Large-scale pre-training and instruction tuning have been successful at creating general-purpose language models with broad competence.

Ranked #5 on visual instruction following on LLaVA-Bench

Video Question Answering visual instruction following +1

8,713

Paper
Code

Merlion: A Machine Learning Library for Time Series

2 code implementations • 20 Sep 2021 • Aadyot Bhatnagar, Paul Kassianik, Chenghao Liu, Tian Lan, Wenzhuo Yang, Rowan Cassius, Doyen Sahoo, Devansh Arpit, Sri Subramanian, Gerald Woo, Amrita Saha, Arun Kumar Jagota, Gokulakrishnan Gopalakrishnan, Manpreet Singh, K C Krithika, Sukumar Maddineni, Daeki Cho, Bo Zong, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Steven Hoi, Huan Wang

We introduce Merlion, an open-source machine learning library for time series.

Anomaly Detection BIG-bench Machine Learning +2

3,257

Paper
Code

ETSformer: Exponential Smoothing Transformers for Time-series Forecasting

2 code implementations • 3 Feb 2022 • Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi

Transformers have been actively studied for time-series forecasting in recent years.

Time Series Time Series Forecasting

660

Paper
Code

LogAI: A Library for Log Analytics and Intelligence

1 code implementation • 31 Jan 2023 • Qian Cheng, Amrita Saha, Wenzhuo Yang, Chenghao Liu, Doyen Sahoo, Steven Hoi

In order to enable users to perform multiple types of AI-based log analysis tasks in a uniform manner, we introduce LogAI (https://github. com/salesforce/logai), a one-stop open source library for log analytics and intelligence.

Anomaly Detection Log Parsing +2

345

Paper
Code

Learning Deep Time-index Models for Time Series Forecasting

1 code implementation • 13 Jul 2022 • Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi

Deep learning has been actively applied to time series forecasting, leading to a deluge of new methods, belonging to the class of historical-value models.

Inductive Bias Meta-Learning +2

339

Paper
Code

TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue

1 code implementation • EMNLP 2020 • Chien-Sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong

The underlying difference of linguistic patterns between general text and task-oriented dialogue makes existing pre-trained language models less useful in practice.

Dialogue State Tracking Intent Detection +3

285

Paper
Code

CoST: Contrastive Learning of Disentangled Seasonal-Trend Representations for Time Series Forecasting

1 code implementation • ICLR 2022 • Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi

Motivated by the recent success of representation learning in computer vision and natural language processing, we argue that a more promising paradigm for time series forecasting, is to first learn disentangled feature representations, followed by a simple regression fine-tuning step -- we justify such a paradigm from a causal perspective.

Contrastive Learning Representation Learning +2

209

Paper
Code

CoMatch: Semi-supervised Learning with Contrastive Graph Regularization

3 code implementations • ICCV 2021 • Junnan Li, Caiming Xiong, Steven Hoi

CoMatch jointly learns two representations of the training data, their class probabilities and low-dimensional embeddings.

Ranked #2 on Semi-Supervised Image Classification on CIFAR-10, 20 Labels

Contrastive Learning Representation Learning +2

121

Paper
Code

BotSIM: An End-to-End Bot Simulation Framework for Commercial Task-Oriented Dialog Systems

1 code implementation • 22 Nov 2022 • Guangsen Wang, Samson Tan, Shafiq Joty, Gang Wu, Jimmy Au, Steven Hoi

We have open-sourced the toolkit at https://github. com/salesforce/botsim

112

Paper
Code

BotSIM: An End-to-End Bot Simulation Toolkit for Commercial Task-Oriented Dialog Systems

1 code implementation • 29 Nov 2022 • Guangsen Wang, Shafiq Joty, Junnan Li, Steven Hoi

BotSIM adopts a layered design comprising the infrastructure layer, the adaptor layer and the application layer.

User Simulation

112

Paper
Code

Classification Calibration for Long-tail Instance Segmentation

1 code implementation • 29 Oct 2019 • Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Jun Hao Liew, Sheng Tang, Steven Hoi, Jiashi Feng

In this report, we investigate the performance drop phenomenon of state-of-the-art two-stage instance segmentation models when processing extreme long-tail training data based on the LVIS [5] dataset, and find a major cause is the inaccurate classification of object proposals.

Classification General Classification +3

100

Paper
Code

The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation

1 code implementation • ECCV 2020 • Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, Jiashi Feng

Specifically, we systematically investigate performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset, and unveil that a major cause is the inaccurate classification of object proposals.

General Classification Instance Segmentation +4

100

Paper
Code

DualNet: Continual Learning, Fast and Slow

1 code implementation • NeurIPS 2021 • Quang Pham, Chenghao Liu, Steven Hoi

According to Complementary Learning Systems (CLS) theory~\citep{mcclelland1995there} in neuroscience, humans do effective \emph{continual learning} through two complementary systems: a fast learning system centered on the hippocampus for rapid learning of the specifics and individual experiences, and a slow learning system located in the neocortex for the gradual acquisition of structured knowledge about the environment.

Continual Learning Hippocampus +2

Paper
Code

HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts

1 code implementation • 12 Dec 2023 • Giang Do, Khiem Le, Quang Pham, TrungTin Nguyen, Thanh-Nam Doan, Bint T. Nguyen, Chenghao Liu, Savitha Ramasamy, XiaoLi Li, Steven Hoi

By routing input tokens to only a few split experts, Sparse Mixture-of-Experts has enabled efficient training of large language models.

Paper
Code

Continual Normalization: Rethinking Batch Normalization for Online Continual Learning

1 code implementation • ICLR 2022 • Quang Pham, Chenghao Liu, Steven Hoi

Existing continual learning methods use Batch Normalization (BN) to facilitate training and improve generalization across tasks.

Continual Learning

Paper
Code

Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation

1 code implementation • 28 Oct 2023 • Hailin Chen, Amrita Saha, Steven Hoi, Shafiq Joty

With the rise of powerful closed-sourced LLMs (ChatGPT, GPT-4), there are increasing interests in distilling the capabilies of close-sourced LLMs to smaller open-sourced LLMs.

Ranked #48 on Code Generation on HumanEval

Code Generation

Paper
Code

An Incremental Path-Following Splitting Method for Linearly Constrained Nonconvex Nonsmooth Programs

no code implementations • 30 Jan 2018 • Linbo Qiao, Wei Liu, Steven Hoi

The stationary point of Problem 2 is NOT the stationary point of Problem 1.

Paper
Add Code

Active Learning with Expert Advice

no code implementations • 26 Sep 2013 • Peilin Zhao, Steven Hoi, Jinfeng Zhuang

In this paper, we address a new problem of active learning with expert advice, where the outcome of an instance is disclosed only when it is requested by the online learner.

Active Learning

Paper
Add Code

Question-Guided Hybrid Convolution for Visual Question Answering

no code implementations • ECCV 2018 • Peng Gao, Pan Lu, Hongsheng Li, Shuang Li, Yikang Li, Steven Hoi, Xiaogang Wang

Most state-of-the-art VQA methods fuse the high-level textual and visual features from the neural network and abandon the visual spatial information when learning multi-modal features. To address these problems, question-guided kernels generated from the input question are designed to convolute with visual features for capturing the textual and visual relationship in the early stage.

Ranked #14 on Visual Question Answering (VQA) on CLEVR

Question Answering Visual Question Answering

Paper
Add Code

Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering

no code implementations • 13 Dec 2018 • Gao Peng, Zhengkai Jiang, Haoxuan You, Pan Lu, Steven Hoi, Xiaogang Wang, Hongsheng Li

It can robustly capture the high-level interactions between language and vision domains, thus significantly improves the performance of visual question answering.

Question Answering Visual Question Answering

Paper
Add Code

DART: Domain-Adversarial Residual-Transfer Networks for Unsupervised Cross-Domain Image Classification

no code implementations • 30 Dec 2018 • Xianghong Fang, Haoli Bai, Ziyi Guo, Bin Shen, Steven Hoi, Zenglin Xu

In this paper, we propose a new unsupervised domain adaptation method named Domain-Adversarial Residual-Transfer (DART) learning of Deep Neural Networks to tackle cross-domain image classification tasks.

Classification General Classification +2

Paper
Add Code

SOLAR: Scalable Online Learning Algorithms for Ranking

no code implementations • IJCNLP 2015 • Jialei Wang, Ji Wan, Yongdong Zhang, Steven Hoi

Collaborative Filtering Information Retrieval +1

Paper
Add Code

Distilled Siamese Networks for Visual Tracking

no code implementations • 24 Jul 2019 • Jianbing Shen, Yuanpei Liu, Xingping Dong, Xiankai Lu, Fahad Shahbaz Khan, Steven Hoi

This model is intuitively inspired by the one teacher vs. multiple students learning method typically employed in schools.

Knowledge Distillation Object Tracking +1

Paper
Add Code

Towards Noise-resistant Object Detection with Noisy Annotations

no code implementations • 3 Mar 2020 • Junnan Li, Caiming Xiong, Richard Socher, Steven Hoi

We address the challenging problem of training object detectors with noisy annotations, where the noise contains a mixture of label noise and bounding box noise.

Object object-detection +1

Paper
Add Code

Extreme Low-Light Imaging with Multi-granulation Cooperative Networks

no code implementations • 16 May 2020 • Keqi Wang, Peng Gao, Steven Hoi, Qian Guo, Yuhua Qian

Low-light imaging is challenging since images may appear to be dark and noised due to low signal-to-noise ratio, complex image content, and the variety in shooting scenes in extreme low-light condition.

Paper
Add Code

Partially Observable Online Change Detection via Smooth-Sparse Decomposition

no code implementations • 22 Sep 2020 • Jie Guo, Hao Yan, Chen Zhang, Steven Hoi

We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities.

Bayesian Inference Change Detection +1

Paper
Add Code

Localized Meta-Learning: A PAC-Bayes Analysis for Meta-Learning Beyond Global Prior

no code implementations • 1 Jan 2021 • Chenghao Liu, Tao Lu, Doyen Sahoo, Yuan Fang, Kun Zhang, Steven Hoi

Meta-learning methods learn the meta-knowledge among various training tasks and aim to promote the learning of new tasks under the task similarity assumption.

Meta-Learning

Paper
Add Code

Online Continual Learning Under Domain Shift

no code implementations • 1 Jan 2021 • Quang Pham, Chenghao Liu, Steven Hoi

CIER employs an adversarial training to correct the shift in $P(X, Y)$ by matching $P(X|Y)$, which results in an invariant representation that can generalize to unseen domains during inference.

Continual Learning

Paper
Add Code

Contextual Transformation Networks for Online Continual Learning

no code implementations • ICLR 2021 • Quang Pham, Chenghao Liu, Doyen Sahoo, Steven Hoi

Continual learning methods with fixed architectures rely on a single network to learn models that can perform well on all tasks.

Continual Learning Transfer Learning

Paper
Add Code

VilNMN: A Neural Module Network approach to Video-Grounded Language Tasks

no code implementations • 1 Jan 2021 • Hung Le, Nancy F. Chen, Steven Hoi

Neural module networks (NMN) have achieved success in image-grounded tasks such as question answering (QA) on synthetic images.

Information Retrieval Question Answering +1

Paper
Add Code

PolarNet: Learning to Optimize Polar Keypoints for Keypoint Based Object Detection

no code implementations • ICLR 2021 • Wu Xiongwei, Doyen Sahoo, Steven Hoi

Despite achieving promising performance at par with anchor-based detectors, the existing anchor-free detectors such as FCOS or CenterNet predict objects based on standard Cartesian coordinates, which often yield poor quality keypoints.

object-detection Object Detection

Paper
Add Code

Noise-Robust Contrastive Learning

no code implementations • 1 Jan 2021 • Junnan Li, Caiming Xiong, Steven Hoi

In contrast to most existing methods, we combat noise by learning robust representation.

Contrastive Learning

Paper
Add Code

Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning

no code implementations • NeurIPS 2020 • Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, Weinan E

The result shows that (1) the escaping time of both SGD and ADAM~depends on the Radon measure of the basin positively and the heaviness of gradient noise negatively; (2) for the same basin, SGD enjoys smaller escaping time than ADAM, mainly because (a) the geometry adaptation in ADAM~via adaptively scaling each gradient coordinate well diminishes the anisotropic structure in gradient noise and results in larger Radon measure of a basin; (b) the exponential gradient average in ADAM~smooths its gradient and leads to lighter gradient noise tails than SGD.

Paper
Add Code

Improving Limited Labeled Dialogue State Tracking with Self-Supervision

no code implementations • Findings of the Association for Computational Linguistics 2020 • Chien-Sheng Wu, Steven Hoi, Caiming Xiong

We present and investigate two self-supervised objectives: preserving latent consistency and modeling conversational behavior.

Dialogue State Tracking

Paper
Add Code

Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition

no code implementations • 3 Dec 2020 • Genta Indra Winata, Guangsen Wang, Caiming Xiong, Steven Hoi

One crucial challenge of real-world multilingual speech recognition is the long-tailed distribution problem, where some resource-rich languages like English have abundant training data, but a long tail of low-resource languages have varying amounts of limited training data.

Language Modelling Multi-Task Learning +2

Paper
Add Code

Detection and Rectification of Arbitrary Shaped Scene Texts by using Text Keypoints and Links

no code implementations • 1 Mar 2021 • Chuhui Xue, Shijian Lu, Steven Hoi

Detection and recognition of scene texts of arbitrary shapes remain a grand challenge due to the super-rich text shape variation in text line orientations, lengths, curvatures, etc.

Scene Text Detection Text Detection

Paper
Add Code

A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning

no code implementations • NeurIPS 2021 • Pan Zhou, Caiming Xiong, Xiao-Tong Yuan, Steven Hoi

Although intuitive, such a native label assignment strategy cannot reveal the underlying semantic similarity between a query and its positives and negatives, and impairs performance, since some negatives are semantically similar to the query or even share the same semantic class as the query.

Contrastive Learning Representation Learning +2

Paper
Add Code

Attention-based Feature Aggregation

no code implementations • 29 Sep 2021 • Xiongwei Wu, Ee-Peng Lim, Steven Hoi, Qianru Sun

To implement this module, we define two variants of attention: self-attention on the summed-up feature map, and cross-attention between two feature maps before summed up.

Instance Segmentation object-detection +2

Paper
Add Code

VGNMN: Video-grounded Neural Module Networks for Video-Grounded Dialogue Systems

no code implementations • NAACL 2022 • Hung Le, Nancy Chen, Steven Hoi

Neural module networks (NMN) have achieved success in image-grounded tasks such as Visual Question Answering (VQA) on synthetic images.

Information Retrieval Question Answering +2

Paper
Add Code

Detect-Localize-Repair: A Unified Framework for Learning to Debug with CodeT5

no code implementations • 27 Nov 2022 • Nghi D. Q. Bui, Yue Wang, Steven Hoi

Specifically, we propose three objectives to adapt the generic CodeT5 for debugging: a bug detection objective to determine whether a given code snippet is buggy or not, a bug localization objective to identify the buggy lines, and a program repair objective to translate the buggy code to its fixed version.

Bug fixing Language Modelling +1

Paper
Add Code

From Images to Textual Prompts: Zero-Shot Visual Question Answering With Frozen Large Language Models

no code implementations • CVPR 2023 • Jiaxian Guo, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Boyang Li, DaCheng Tao, Steven Hoi

To address this issue, we propose Img2Prompt, a plug-and-play module that provides the prompts that can bridge the aforementioned modality and task disconnections, so that LLMs can perform zero-shot VQA tasks without end-to-end training.

Question Answering Visual Question Answering +1

Paper
Add Code

CompeteSMoE - Effective Training of Sparse Mixture of Experts via Competition

no code implementations • 4 Feb 2024 • Quang Pham, Giang Do, Huy Nguyen, TrungTin Nguyen, Chenghao Liu, Mina Sartipi, Binh T. Nguyen, Savitha Ramasamy, XiaoLi Li, Steven Hoi, Nhat Ho

Sparse mixture of experts (SMoE) offers an appealing solution to scale up the model complexity beyond the mean of increasing the network's depth or width.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.