no code implementations • NAACL 2022 • Hung Le, Nancy Chen, Steven Hoi
Neural module networks (NMN) have achieved success in image-grounded tasks such as Visual Question Answering (VQA) on synthetic images.
1 code implementation • 4 Feb 2024 • Quang Pham, Giang Do, Huy Nguyen, TrungTin Nguyen, Chenghao Liu, Mina Sartipi, Binh T. Nguyen, Savitha Ramasamy, XiaoLi Li, Steven Hoi, Nhat Ho
Sparse mixture of experts (SMoE) offers an appealing solution to scale up model complexity beyond the means of increasing the network's depth or width.
1 code implementation • 12 Dec 2023 • Giang Do, Khiem Le, Quang Pham, TrungTin Nguyen, Thanh-Nam Doan, Binh T. Nguyen, Chenghao Liu, Savitha Ramasamy, XiaoLi Li, Steven Hoi
By routing input tokens to only a few split experts, Sparse Mixture-of-Experts has enabled efficient training of large language models.
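The routing idea both of these entries build on can be made concrete with a small sketch. Below is a minimal top-k router in PyTorch; the class, expert shapes, and gating details are illustrative assumptions, not either paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Minimal sparse-MoE layer: each token is routed to its top-k experts
    and their outputs are mixed by softmax-normalized gate weights."""
    def __init__(self, dim, num_experts=8, k=2):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts))
        self.k = k

    def forward(self, x):                              # x: (tokens, dim)
        gate_logits = self.gate(x)                     # (tokens, num_experts)
        topk_vals, topk_idx = gate_logits.topk(self.k, dim=-1)
        weights = F.softmax(topk_vals, dim=-1)         # mix only the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                sel = topk_idx[:, slot] == e           # tokens whose slot-th pick is e
                if sel.any():
                    out[sel] += weights[sel, slot].unsqueeze(1) * expert(x[sel])
        return out
```

Because each token activates only k experts, the per-token compute stays roughly constant as the expert count grows, which is what makes SMoE scaling appealing.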
1 code implementation • 28 Oct 2023 • Hailin Chen, Amrita Saha, Steven Hoi, Shafiq Joty
With the rise of powerful closed-source LLMs (ChatGPT, GPT-4), there is increasing interest in distilling the capabilities of closed-source LLMs into smaller open-source LLMs.
4 code implementations • NeurIPS 2023 • Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Junqi Zhao, Weisheng Wang, Boyang Li, Pascale Fung, Steven Hoi
Large-scale pre-training and instruction tuning have been successful at creating general-purpose language models with broad competence.
Ranked #5 on Visual Question Answering on BenchLMM
1 code implementation • 31 Jan 2023 • Qian Cheng, Amrita Saha, Wenzhuo Yang, Chenghao Liu, Doyen Sahoo, Steven Hoi
In order to enable users to perform multiple types of AI-based log analysis tasks in a uniform manner, we introduce LogAI (https://github.com/salesforce/logai), a one-stop open source library for log analytics and intelligence.
16 code implementations • ICML 2023 • Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models.
Ranked #1 on Image-to-Text Retrieval on MS COCO
no code implementations • CVPR 2023 • Jiaxian Guo, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Boyang Li, Dacheng Tao, Steven Hoi
To address this issue, we propose Img2Prompt, a plug-and-play module that provides the prompts that can bridge the aforementioned modality and task disconnections, so that LLMs can perform zero-shot VQA tasks without end-to-end training.
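As a rough illustration of the plug-and-play idea, the sketch below turns the image into text with an off-the-shelf captioner and hands a constructed prompt to a frozen LLM; `caption_model`, `llm`, and the prompt template are hypothetical placeholders rather than the paper's actual prompt design.

```python
def zero_shot_vqa(image, question, caption_model, llm):
    """Bridge the modality gap with text: describe the image, then let a
    frozen LLM answer. `caption_model` and `llm` are placeholder callables."""
    caption = caption_model(image)            # image -> natural-language context
    prompt = (f"Context: {caption}\n"
              f"Question: {question}\n"
              f"Answer:")
    return llm(prompt)                        # no end-to-end vision training
```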
1 code implementation • 29 Nov 2022 • Guangsen Wang, Shafiq Joty, Junnan Li, Steven Hoi
BotSIM adopts a layered design comprising the infrastructure layer, the adaptor layer and the application layer.
no code implementations • 27 Nov 2022 • Nghi D. Q. Bui, Yue Wang, Steven Hoi
Specifically, we propose three objectives to adapt the generic CodeT5 for debugging: a bug detection objective to determine whether a given code snippet is buggy or not, a bug localization objective to identify the buggy lines, and a program repair objective to translate the buggy code to its fixed version.
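A hedged sketch of how the three objectives could be combined as a multi-task loss; the head names, tensor shapes, and equal weighting are assumptions, not the authors' formulation.

```python
import torch.nn.functional as F

def debugging_losses(line_feats, det_head, loc_head, repair_logits,
                     is_buggy, buggy_lines, fixed_ids):
    """line_feats:    (B, L, d) line-level encodings of the code snippet
    is_buggy:      (B,) snippet-level 0/1 labels
    buggy_lines:   (B, L) per-line 0/1 labels
    fixed_ids:     (B, S) target token ids of the fixed program
    repair_logits: (B, S, V) decoder logits"""
    pooled = line_feats.mean(dim=1)
    # 1) bug detection: is the snippet buggy at all?
    l_det = F.binary_cross_entropy_with_logits(
        det_head(pooled).squeeze(-1), is_buggy.float())
    # 2) bug localization: which lines are buggy?
    l_loc = F.binary_cross_entropy_with_logits(
        loc_head(line_feats).squeeze(-1), buggy_lines.float())
    # 3) program repair: standard seq2seq cross-entropy to the fixed code
    l_rep = F.cross_entropy(repair_logits.flatten(0, 1), fixed_ids.flatten())
    return l_det + l_loc + l_rep
```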
1 code implementation • 22 Nov 2022 • Guangsen Wang, Samson Tan, Shafiq Joty, Gang Wu, Jimmy Au, Steven Hoi
We have open-sourced the toolkit at https://github.com/salesforce/botsim.
1 code implementation • 13 Jul 2022 • Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi
Deep learning has been actively applied to time series forecasting, leading to a deluge of new methods, belonging to the class of historical-value models.
1 code implementation • ICLR 2022 • Quang Pham, Chenghao Liu, Steven Hoi
Existing continual learning methods use Batch Normalization (BN) to facilitate training and improve generalization across tasks.
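The core difficulty is easy to reproduce in isolation: BN's running statistics are shared across tasks, so training on a later task shifts the normalization that earlier tasks were learned under. A tiny PyTorch illustration on synthetic data (not the paper's setup):

```python
import torch
import torch.nn as nn

# BatchNorm's running statistics are shared across tasks, so a new task
# overwrites the normalization earlier tasks were trained under.
bn = nn.BatchNorm1d(4).train()
for _ in range(100):
    bn(torch.randn(32, 4))           # "task 1": zero-mean inputs
print(bn.running_mean)               # ~= 0
for _ in range(100):
    bn(torch.randn(32, 4) + 5.0)     # "task 2": shifted inputs
print(bn.running_mean)               # drifts toward 5, degrading task 1
```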
1 code implementation • ICLR 2022 • Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi
Motivated by the recent success of representation learning in computer vision and natural language processing, we argue that a more promising paradigm for time series forecasting is to first learn disentangled feature representations, followed by a simple regression fine-tuning step -- we justify such a paradigm from a causal perspective.
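The proposed two-stage paradigm boils down to freezing a learned representation and fitting a simple regressor on top. A minimal sketch, where `encoder` stands in for any pretrained time-series encoder (a placeholder, not CoST itself):

```python
import numpy as np
from sklearn.linear_model import Ridge

def two_stage_forecaster(encoder, windows, targets):
    """Stage 1 is assumed done: `encoder` maps a history window to a frozen
    feature vector. Stage 2 fits a simple (ridge) regression head on top."""
    feats = np.stack([encoder(w) for w in windows])   # (N, d) representations
    head = Ridge(alpha=1.0).fit(feats, targets)       # (N, horizon) targets
    return lambda window: head.predict(encoder(window)[None])[0]
```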
2 code implementations • 3 Feb 2022 • Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi
Transformers have been actively studied for time-series forecasting in recent years.
9 code implementations • 28 Jan 2022 • Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision.
Ranked #3 on Open Vocabulary Attribute Detection on OVAD-Box benchmark (using extra training data)
1 code implementation • NeurIPS 2021 • Quang Pham, Chenghao Liu, Steven Hoi
According to Complementary Learning Systems (CLS) theory (McClelland et al., 1995) in neuroscience, humans achieve effective continual learning through two complementary systems: a fast learning system centered on the hippocampus for rapid learning of the specifics of individual experiences, and a slow learning system located in the neocortex for the gradual acquisition of structured knowledge about the environment.
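A toy sketch of the fast/slow interplay this analogy suggests: a fast learner takes rapid gradient steps on the current experience, while a slow learner consolidates knowledge through a slowly-moving average. This is purely illustrative of the CLS picture, not the paper's architecture.

```python
import torch

def fast_slow_step(fast_net, slow_net, loss, fast_lr=1e-2, tau=0.999):
    """One update of the toy fast/slow pair. `loss` is computed from
    fast_net on the current batch; slow_net mirrors its architecture."""
    loss.backward()                                  # gradients for the fast learner
    with torch.no_grad():
        for p_f, p_s in zip(fast_net.parameters(), slow_net.parameters()):
            p_f -= fast_lr * p_f.grad                # rapid, task-specific step
            p_s.mul_(tau).add_((1 - tau) * p_f)      # gradual consolidation
            p_f.grad = None
```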
no code implementations • 29 Sep 2021 • Xiongwei Wu, Ee-Peng Lim, Steven Hoi, Qianru Sun
To implement this module, we define two variants of attention: self-attention on the summed-up feature map, and cross-attention between the two feature maps before they are summed.
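The two variants can be written compactly. The sketch below assumes feature maps flattened to (batch, hw, dim) and a standard multi-head attention layer; these choices are assumptions rather than the submission's exact module.

```python
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)

def fuse(feat_a, feat_b, variant="self"):
    """feat_a, feat_b: (batch, hw, 256) flattened feature maps.
    'self'  -> self-attention on the summed-up feature map;
    'cross' -> cross-attention from feat_a (queries) to feat_b (keys/values)
               before the maps are summed."""
    if variant == "self":
        s = feat_a + feat_b
        out, _ = attn(s, s, s)
    else:
        out, _ = attn(feat_a, feat_b, feat_b)
    return out
```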
2 code implementations • 20 Sep 2021 • Aadyot Bhatnagar, Paul Kassianik, Chenghao Liu, Tian Lan, Wenzhuo Yang, Rowan Cassius, Doyen Sahoo, Devansh Arpit, Sri Subramanian, Gerald Woo, Amrita Saha, Arun Kumar Jagota, Gokulakrishnan Gopalakrishnan, Manpreet Singh, K C Krithika, Sukumar Maddineni, Daeki Cho, Bo Zong, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Steven Hoi, Huan Wang
We introduce Merlion, an open-source machine learning library for time series.
6 code implementations • NeurIPS 2021 • Junnan Li, Ramprasaath R. Selvaraju, Akhilesh Deepak Gotmare, Shafiq Joty, Caiming Xiong, Steven Hoi
Most existing methods employ a transformer-based multimodal encoder to jointly model visual tokens (region-based image features) and word tokens.
Ranked #5 on Open Vocabulary Attribute Detection on OVAD-Box benchmark (using extra training data)
no code implementations • NeurIPS 2021 • Pan Zhou, Caiming Xiong, Xiao-Tong Yuan, Steven Hoi
Although intuitive, such a naive label assignment strategy cannot reveal the underlying semantic similarity between a query and its positives and negatives, and it impairs performance, since some negatives are semantically similar to the query or even share the same semantic class as the query.
no code implementations • 1 Mar 2021 • Chuhui Xue, Shijian Lu, Steven Hoi
Detection and recognition of scene texts of arbitrary shapes remain a grand challenge due to the extremely rich variation in text shapes, spanning text line orientations, lengths, curvatures, etc.
14 code implementations • 3 Jan 2021 • Jing Xu, Yu Pan, Xinglin Pan, Steven Hoi, Zhang Yi, Zenglin Xu
The ResNet and its variants have achieved remarkable successes in various computer vision tasks.
Ranked #3 on Medical Image Classification on NCT-CRC-HE-100K
no code implementations • ICLR 2021 • Xiongwei Wu, Doyen Sahoo, Steven Hoi
Despite achieving promising performance on par with anchor-based detectors, existing anchor-free detectors such as FCOS or CenterNet predict objects based on standard Cartesian coordinates, which often yields poor-quality keypoints.
no code implementations • 1 Jan 2021 • Chenghao Liu, Tao Lu, Doyen Sahoo, Yuan Fang, Kun Zhang, Steven Hoi
Meta-learning methods learn the meta-knowledge among various training tasks and aim to promote the learning of new tasks under the task similarity assumption.
no code implementations • 1 Jan 2021 • Quang Pham, Chenghao Liu, Steven Hoi
CIER employs an adversarial training to correct the shift in $P(X, Y)$ by matching $P(X|Y)$, which results in an invariant representation that can generalize to unseen domains during inference.
no code implementations • ICLR 2021 • Quang Pham, Chenghao Liu, Doyen Sahoo, Steven Hoi
Continual learning methods with fixed architectures rely on a single network to learn models that can perform well on all tasks.
no code implementations • 1 Jan 2021 • Hung Le, Nancy F. Chen, Steven Hoi
Neural module networks (NMN) have achieved success in image-grounded tasks such as question answering (QA) on synthetic images.
no code implementations • 1 Jan 2021 • Junnan Li, Caiming Xiong, Steven Hoi
In contrast to most existing methods, we combat noise by learning robust representation.
no code implementations • 3 Dec 2020 • Genta Indra Winata, Guangsen Wang, Caiming Xiong, Steven Hoi
One crucial challenge of real-world multilingual speech recognition is the long-tailed distribution problem, where some resource-rich languages like English have abundant training data, but a long tail of low-resource languages have varying amounts of limited training data.
3 code implementations • ICCV 2021 • Junnan Li, Caiming Xiong, Steven Hoi
CoMatch jointly learns two representations of the training data, their class probabilities and low-dimensional embeddings.
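A simplified sketch of how the two representations can interact: confident pseudo-labels from a weakly augmented view supervise the classifier on a strongly augmented view, while a pseudo-label similarity graph supervises the embedding similarity graph. The threshold and temperature values are illustrative; consult the released code for the actual method.

```python
import torch
import torch.nn.functional as F

def comatch_style_losses(logits_weak, logits_strong, emb_strong, tau=0.95, t=0.1):
    """Simplified interaction between class probabilities and embeddings."""
    probs = logits_weak.softmax(dim=-1).detach()
    conf, pseudo = probs.max(dim=-1)
    keep = conf >= tau
    # classification: confident pseudo-labels supervise the strong view
    l_cls = (F.cross_entropy(logits_strong[keep], pseudo[keep])
             if keep.any() else logits_strong.sum() * 0.0)
    # contrastive: the embedding similarity graph should match the
    # pseudo-label similarity graph
    w = probs @ probs.t()                     # (B, B) pseudo-label graph
    w = w / w.sum(dim=-1, keepdim=True)
    z = F.normalize(emb_strong, dim=-1)
    sim = F.softmax(z @ z.t() / t, dim=-1)    # (B, B) embedding graph
    l_ctr = -(w * sim.clamp_min(1e-8).log()).sum(dim=-1).mean()
    return l_cls + l_ctr
```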
no code implementations • Findings of the Association for Computational Linguistics 2020 • Chien-Sheng Wu, Steven Hoi, Caiming Xiong
We present and investigate two self-supervised objectives: preserving latent consistency and modeling conversational behavior.
no code implementations • NeurIPS 2020 • Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, Weinan E
The result shows that (1) the escaping time of both SGD and Adam depends positively on the Radon measure of the basin and negatively on the heaviness of the gradient noise; (2) for the same basin, SGD enjoys a smaller escaping time than Adam, mainly because (a) the geometry adaptation in Adam, via adaptively scaling each gradient coordinate, diminishes the anisotropic structure in the gradient noise and results in a larger Radon measure of a basin; and (b) the exponential gradient average in Adam smooths its gradient and leads to lighter gradient-noise tails than SGD.
no code implementations • 22 Sep 2020 • Jie Guo, Hao Yan, Chen Zhang, Steven Hoi
We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities.
1 code implementation • ECCV 2020 • Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Junhao Liew, Sheng Tang, Steven Hoi, Jiashi Feng
Specifically, we systematically investigate performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset, and unveil that a major cause is the inaccurate classification of object proposals.
no code implementations • 16 May 2020 • Keqi Wang, Peng Gao, Steven Hoi, Qian Guo, Yuhua Qian
Low-light imaging is challenging since images may appear dark and noisy due to the low signal-to-noise ratio, complex image content, and the wide variety of shooting scenes in extreme low-light conditions.
1 code implementation • EMNLP 2020 • Chien-Sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
The underlying difference of linguistic patterns between general text and task-oriented dialogue makes existing pre-trained language models less useful in practice.
no code implementations • 3 Mar 2020 • Junnan Li, Caiming Xiong, Richard Socher, Steven Hoi
We address the challenging problem of training object detectors with noisy annotations, where the noise contains a mixture of label noise and bounding box noise.
1 code implementation • 29 Oct 2019 • Tao Wang, Yu Li, Bingyi Kang, Junnan Li, Jun Hao Liew, Sheng Tang, Steven Hoi, Jiashi Feng
In this report, we investigate the performance drop phenomenon of state-of-the-art two-stage instance segmentation models when processing extreme long-tail training data based on the LVIS [5] dataset, and find a major cause is the inaccurate classification of object proposals.
no code implementations • 24 Jul 2019 • Jianbing Shen, Yuanpei Liu, Xingping Dong, Xiankai Lu, Fahad Shahbaz Khan, Steven Hoi
This model is intuitively inspired by the one teacher vs. multiple students learning method typically employed in schools.
no code implementations • 30 Dec 2018 • Xianghong Fang, Haoli Bai, Ziyi Guo, Bin Shen, Steven Hoi, Zenglin Xu
In this paper, we propose a new unsupervised domain adaptation method named Domain-Adversarial Residual-Transfer (DART) learning of Deep Neural Networks to tackle cross-domain image classification tasks.
no code implementations • 13 Dec 2018 • Peng Gao, Zhengkai Jiang, Haoxuan You, Pan Lu, Steven Hoi, Xiaogang Wang, Hongsheng Li
It can robustly capture the high-level interactions between the language and vision domains, thus significantly improving the performance of visual question answering.
no code implementations • ECCV 2018 • Peng Gao, Pan Lu, Hongsheng Li, Shuang Li, Yikang Li, Steven Hoi, Xiaogang Wang
Most state-of-the-art VQA methods fuse the high-level textual and visual features from the neural network and abandon the visual spatial information when learning multi-modal features. To address these problems, question-guided kernels generated from the input question are designed to convolve with the visual features, capturing the textual-visual relationship at an early stage.
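A minimal sketch of the question-guided kernel idea: predict a convolution kernel from the question embedding and convolve it with the visual feature map, so spatial structure survives the fusion. The depthwise design and all dimensions are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuestionGuidedConv(nn.Module):
    """Predict a depthwise conv kernel from the question embedding and
    convolve it with the visual feature map, keeping spatial structure."""
    def __init__(self, q_dim, channels, ksize=3):
        super().__init__()
        self.channels, self.ksize = channels, ksize
        self.to_kernel = nn.Linear(q_dim, channels * ksize * ksize)

    def forward(self, vis, q):                 # vis: (B, C, H, W), q: (B, q_dim)
        outs = []
        for v, qi in zip(vis, q):              # one predicted kernel per sample
            k = self.to_kernel(qi).view(self.channels, 1, self.ksize, self.ksize)
            outs.append(F.conv2d(v[None], k, padding=self.ksize // 2,
                                 groups=self.channels))
        return torch.cat(outs, dim=0)
```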
no code implementations • 30 Jan 2018 • Linbo Qiao, Wei Liu, Steven Hoi
The stationary point of Problem 2 is NOT the stationary point of Problem 1.
no code implementations • 26 Sep 2013 • Peilin Zhao, Steven Hoi, Jinfeng Zhuang
In this paper, we address a new problem of active learning with expert advice, where the outcome of an instance is disclosed only when it is requested by the online learner.
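To make the setting concrete, here is a toy exponential-weights learner that predicts from expert advice on every round but learns only from the labels it explicitly requests; the fixed query probability is a simplification, not the paper's query strategy.

```python
import numpy as np

def active_expert_advice(expert_preds, labels, eta=0.5, query_prob=0.3, seed=0):
    """expert_preds: (T, n_experts) array of +-1 advice; labels: (T,) +-1.
    The learner predicts every round but updates weights only on rounds
    where it requests the true outcome."""
    rng = np.random.default_rng(seed)
    w = np.ones(expert_preds.shape[1])
    mistakes = 0
    for preds, y in zip(expert_preds, labels):
        y_hat = np.sign((w / w.sum()) @ preds)     # weighted-majority prediction
        mistakes += int(y_hat != y)
        if rng.random() < query_prob:              # label requested this round
            w *= np.exp(-eta * (preds != y))       # exponential-weights update
    return w, mistakes
```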