Search Results for author: Chong-Wah Ngo

Found 40 papers, 16 papers with code

Semi-Supervised Domain Adaptation With Subspace Learning for Visual Recognition

no code implementations • CVPR 2015 • Ting Yao, Yingwei Pan, Chong-Wah Ngo, Houqiang Li, Tao Mei

In many real-world applications, we are often facing the problem of cross domain learning, i.e., to borrow the labeled data or transfer the already learnt knowledge from a source domain to a target domain.

Domain Adaptation Object Recognition +1

Learning Query and Image Similarities With Ranking Canonical Correlation Analysis

no code implementations • ICCV 2015 • Ting Yao, Tao Mei, Chong-Wah Ngo

One of the fundamental problems in image search is to learn the ranking functions, i.e., similarity between the query and image.

Image Retrieval

Boost K-Means

no code implementations • 8 Oct 2016 • Wan-Lei Zhao, Cheng-Hao Deng, Chong-Wah Ngo

The performance of k-means has been enhanced from different perspectives over the years.

Clustering Image Clustering +1

McKernel: A Library for Approximate Kernel Expansions in Log-linear Time

1 code implementation • 27 Feb 2017 • Joachim D. Curtó, Irene C. Zarza, Feng Yang, Alexander J. Smola, Fernando de la Torre, Chong-Wah Ngo, Luc van Gool

The algorithm requires computing the product of Walsh Hadamard Transform (WHT) matrices.

General Classification

Approximate k-NN Graph Construction: a Generic Online Approach

no code implementations • 9 Apr 2018 • Wan-Lei Zhao, Hui Wang, Chong-Wah Ngo

On the one hand, the approximate k-nearest neighbor graph construction is treated as a search task.

graph construction Information Retrieval +1

On the Selection of Anchors and Targets for Video Hyperlinking

no code implementations • 14 Apr 2018 • Zhi-Qi Cheng, Hao Zhang, Xiao Wu, Chong-Wah Ngo

A principled way of hyperlinking can be carried out by picking centers of clusters as anchors and from there reaching out to targets within or outside of clusters with consideration of neighborhood complexity.

Exploring Object Relation in Mean Teacher for Cross-Domain Detection

1 code implementation • CVPR 2019 • Qi Cai, Yingwei Pan, Chong-Wah Ngo, Xinmei Tian, Ling-Yu Duan, Ting Yao

The whole architecture is then optimized with three consistency regularizations: 1) region-level consistency to align the region-level predictions between teacher and student, 2) inter-graph consistency for matching the graph structures between teacher and student, and 3) intra-graph consistency to enhance the similarity between regions of the same class within the graph of the student.

Relation Unsupervised Domain Adaptation

Transferrable Prototypical Networks for Unsupervised Domain Adaptation

no code implementations • CVPR 2019 • Yingwei Pan, Ting Yao, Yehao Li, Yu Wang, Chong-Wah Ngo, Tao Mei

Specifically, we present Transferrable Prototypical Networks (TPN) for adaptation such that the prototypes for each class in source and target domains are close in the embedding space and the score distributions predicted by prototypes separately on source and target data are similar.

Pseudo Label Unsupervised Domain Adaptation

Learning Spatio-Temporal Representation with Local and Global Diffusion

no code implementations • CVPR 2019 • Zhaofan Qiu, Ting Yao, Chong-Wah Ngo, Xinmei Tian, Tao Mei

Diffusions effectively interact two aspects of information, i.e., localized and holistic, for a more powerful way of representation learning.

Ranked #8 on Action Recognition on UCF101

Action Classification Action Detection +5

vireoJD-MM at Activity Detection in Extended Videos

no code implementations • 20 Jun 2019 • Fuchen Long, Qi Cai, Zhaofan Qiu, Zhijian Hou, Yingwei Pan, Ting Yao, Chong-Wah Ngo

This notebook paper presents an overview and comparative analysis of our system designed for activity detection in extended videos (ActEV-PC) in ActivityNet Challenge 2019.

Action Detection Action Localization +1

On the Merge of k-NN Graph

1 code implementation • 2 Aug 2019 • Wan-Lei Zhao, Hui Wang, Peng-Cheng Lin, Chong-Wah Ngo

Unfortunately, a closely related issue of how to merge two existing k-NN graphs has been overlooked.

graph construction Information Retrieval +1

Deeply Activated Salient Region for Instance Search

no code implementations • 1 Feb 2020 • Hui-Chu Xiao, Wan-Lei Zhao, Jie Lin, Chong-Wah Ngo

Due to the lack of proper mechanism in locating instances and deriving feature representation, instance search is generally only effective for retrieving instances of known object categories.

Image Retrieval Instance Search

k-sums: another side of k-means

1 code implementation • 19 May 2020 • Wan-Lei Zhao, Run-Qing Chen, Hui Ye, Chong-Wah Ngo

This optimization procedure converges faster to a better local minimum over k-means and many of its variants.

Clustering Stochastic Optimization

Transferring and Regularizing Prediction for Semantic Segmentation

no code implementations • CVPR 2020 • Yiheng Zhang, Zhaofan Qiu, Ting Yao, Chong-Wah Ngo, Dong Liu, Tao Mei

In the view of extremely expensive expert labeling, recent research has shown that the models trained on photo-realistic synthetic data (e.g., computer games) with computer-generated annotations can be adapted to real images.

Ranked #17 on Domain Adaptation on SYNTHIA-to-Cityscapes

Domain Adaptation Segmentation +1

Exploring Category-Agnostic Clusters for Open-Set Domain Adaptation

no code implementations • CVPR 2020 • Yingwei Pan, Ting Yao, Yehao Li, Chong-Wah Ngo, Tao Mei

A clustering branch is capitalized on to ensure that the learnt representation preserves such underlying structure by matching the estimated assignment distribution over clusters to the inherent cluster distribution for each target sample.

Clustering Unsupervised Domain Adaptation

Multi-modal Cooking Workflow Construction for Food Recipes

no code implementations • 20 Aug 2020 • Liangming Pan, Jingjing Chen, Jianlong Wu, Shaoteng Liu, Chong-Wah Ngo, Min-Yen Kan, Yu-Gang Jiang, Tat-Seng Chua

Understanding a food recipe requires anticipating the implicit causal effects of cooking actions, such that the recipe can be converted into a graph describing the temporal workflow of the recipe.

Common Sense Reasoning

Pyramid Fusion Dark Channel Prior for Single Image Dehazing

no code implementations • 21 May 2021 • Qiyuan Liang, Bin Zhu, Chong-Wah Ngo

In this paper, we propose the pyramid fusion dark channel prior (PF-DCP) for single image dehazing.

Image Dehazing Single Image Dehazing

Token Shift Transformer for Video Classification

3 code implementations • 5 Aug 2021 • Hao Zhang, Yanbin Hao, Chong-Wah Ngo

It is worth noticing that our TokShift transformer is a pure convolutional-free video transformer pilot with computational efficiency for video understanding.

Classification Computational Efficiency +2

CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval

1 code implementation • 21 Sep 2021 • Zhijian Hou, Chong-Wah Ngo, Wing Kwong Chan

This task is essential because advanced video retrieval applications should enable users to retrieve a precise moment from a large video corpus.

Ranked #1 on Video Corpus Moment Retrieval on TVR

Corpus Video Moment Retrieval Moment Retrieval +6

Condensing a Sequence to One Informative Frame for Video Recognition

no code implementations • ICCV 2021 • Zhaofan Qiu, Ting Yao, Yan Shu, Chong-Wah Ngo, Tao Mei

This paper studies a two-step alternative that first condenses the video sequence to an informative "frame" and then exploits off-the-shelf image recognition system on the synthetic frame.

Motion Estimation valid +1

Boosting Video Representation Learning with Multi-Faceted Integration

no code implementations • CVPR 2021 • Zhaofan Qiu, Ting Yao, Chong-Wah Ngo, Xiao-Ping Zhang, Dong Wu, Tao Mei

Video content is multifaceted, consisting of objects, scenes, interactions or actions.

Action Recognition Representation Learning +1

Optimization Planning for 3D ConvNets

1 code implementation • 11 Jan 2022 • Zhaofan Qiu, Ting Yao, Chong-Wah Ngo, Tao Mei

In this paper, we decompose the path into a series of training "states" and specify the hyper-parameters, e.g., learning rate and the length of input clips, in each state.

Video Recognition

Group Contextualization for Video Recognition

1 code implementation • CVPR 2022 • Yanbin Hao, Hao Zhang, Chong-Wah Ngo, Xiangnan He

By utilizing calibrators to embed feature with four different kinds of contexts in parallel, the learnt representation is expected to be more resilient to diverse types of activities.

Ranked #3 on Egocentric Activity Recognition on EGTEA

Action Recognition Egocentric Activity Recognition +1

Adaptive Split-Fusion Transformer

1 code implementation • 26 Apr 2022 • Zixuan Su, Hao Zhang, Jingjing Chen, Lei Pang, Chong-Wah Ngo, Yu-Gang Jiang

Neural networks for visual content understanding have recently evolved from convolutional ones (CNNs) to transformers.

Ranked #1 on Image Classification on CIFAR-10 Image Classification

Image Classification

Cross-lingual Adaptation for Recipe Retrieval with Mixup

no code implementations • 8 May 2022 • Bin Zhu, Chong-Wah Ngo, Jingjing Chen, Wing-Kwong Chan

To bridge the domain gap, recipe mixup loss is proposed to enforce the intermediate domain to locate in the shortest geodesic path between source and target domains in the recipe embedding space.

Retrieval Unsupervised Domain Adaptation

MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing

no code implementations • CVPR 2022 • Zhaofan Qiu, Ting Yao, Chong-Wah Ngo, Tao Mei

By deriving the novel grouped time mixing (GTM) operations, we equip the basic token-mixing MLP with the ability of temporal modeling.

Ranked #21 on Action Recognition on Something-Something V1

3D Architecture Action Classification +2

(Un)likelihood Training for Interpretable Embedding

1 code implementation • 1 Jul 2022 • Jiaxin Wu, Chong-Wah Ngo, Wing-Kwong Chan, Zhijian Hou

Cross-modal representation learning has become a new normal for bridging the semantic gap between text and visual data.

Ad-hoc video search Representation Learning +2

Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning

2 code implementations • 11 Jul 2022 • Ting Yao, Yingwei Pan, Yehao Li, Chong-Wah Ngo, Tao Mei

Motivated by the wavelet theory, we construct a new Wavelet Vision Transformer (Wave-ViT) that formulates the invertible down-sampling with wavelet transforms and self-attention learning in a unified way.

Ranked #209 on Image Classification on ImageNet

Image Classification Instance Segmentation +4

Long-term Leap Attention, Short-term Periodic Shift for Video Classification

1 code implementation • 12 Jul 2022 • Hao Zhang, Lechao Cheng, Yanbin Hao, Chong-Wah Ngo

By replacing a vanilla 2D attention with the LAPS, we could adapt a static transformer into a video one, with zero extra parameters and neglectable computation overhead (~2.6%).

Video Classification

CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding

1 code implementation • 22 Sep 2022 • Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing-Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan

This paper tackles an emerging and challenging problem of long video temporal grounding (VTG) that localizes video moments related to a natural language (NL) query.

Contrastive Learning Video Grounding

Dynamic Temporal Filtering in Video Models

1 code implementation • 15 Nov 2022 • Fuchen Long, Zhaofan Qiu, Yingwei Pan, Ting Yao, Chong-Wah Ngo, Tao Mei

The pre-determined kernel size severely limits the temporal receptive fields and the fixed weights treat each spatial location across frames equally, resulting in sub-optimal solution for long-range temporal modeling in natural scenes.

An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022

no code implementations • 16 Nov 2022 • Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing-Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan

This technical report describes the CONE approach for Ego4D Natural Language Queries (NLQ) Challenge in ECCV 2022.

Contrastive Learning Natural Language Queries

ObjectFusion: Multi-modal 3D Object Detection with Object-Centric Fusion

no code implementations • ICCV 2023 • Qi Cai, Yingwei Pan, Ting Yao, Chong-Wah Ngo, Tao Mei

Recent progress on multi-modal 3D object detection has featured BEV (Bird-Eye-View) based fusion, which effectively unifies both LiDAR point clouds and camera images in a shared BEV space.

3D Object Detection Depth Estimation +2

Interactive Video Corpus Moment Retrieval using Reinforcement Learning

no code implementations • 19 Feb 2023 • Zhixin Ma, Chong-Wah Ngo

Nevertheless, when the first few pages of results are swamped with visually similar items, or the search target is hidden deep in the ranked list, finding the known-item target usually requires a long duration of browsing and result inspection.

Moment Retrieval reinforcement-learning +3

GroundNLQ @ Ego4D Natural Language Queries Challenge 2023

1 code implementation • 27 Jun 2023 • Zhijian Hou, Lei Ji, Difei Gao, Wanjun Zhong, Kun Yan, Chao Li, Wing-Kwong Chan, Chong-Wah Ngo, Nan Duan, Mike Zheng Shou

Motivated by this, we leverage a two-stage pre-training strategy to train egocentric feature extractors and the grounding model on video narrations, and further fine-tune the model on annotated data.

Natural Language Queries

Incremental Learning on Food Instance Segmentation

no code implementations • 28 Jun 2023 • Huu-Thanh Nguyen, Yu Cao, Chong-Wah Ngo, Wing-Kwong Chan

The power of the framework is a novel difficulty assessment model, which forecasts how challenging an unlabelled sample is to the latest trained instance segmentation model.

Incremental Learning Instance Segmentation +2

FoodLMM: A Versatile Food Assistant using Large Multi-modal Model

no code implementations • 22 Dec 2023 • Yuehao Yin, Huiyan Qi, Bin Zhu, Jingjing Chen, Yu-Gang Jiang, Chong-Wah Ngo

In the second stage, we construct a multi-round conversation dataset and a reasoning segmentation dataset to fine-tune the model, enabling it to conduct professional dialogues and generate segmentation masks based on complex reasoning in the food domain.

Food Recognition Multi-Task Learning +3

Interpretable Embedding for Ad-hoc Video Search

1 code implementation • 19 Feb 2024 • Jiaxin Wu, Chong-Wah Ngo

Answering queries with semantic concepts has long been the mainstream approach for video search.

Ad-hoc video search

OVFoodSeg: Elevating Open-Vocabulary Food Image Segmentation via Image-Informed Textual Representation

no code implementations • 1 Apr 2024 • Xiongwei Wu, Sicheng Yu, Ee-Peng Lim, Chong-Wah Ngo

The pre-training phase equips FoodLearner with the capability to align visual information with corresponding textual representations that are specifically related to food, while the second phase adapts both the FoodLearner and the Image-Informed Text Encoder for the segmentation task.

Image Segmentation Segmentation +1

Improving Interpretable Embeddings for Ad-hoc Video Search with Generative Captions and Multi-word Concept Bank

no code implementations • 9 Apr 2024 • Jiaxin Wu, Chong-Wah Ngo, Wing-Kwong Chan

Experimental results show that the integration of the above-proposed elements doubles the R@1 performance of the AVS method on the MSRVTT dataset and improves the xinfAP on the TRECVid AVS query sets for 2016-2023 (eight years) by a margin from 2% to 77%, with an average of about 20%.

Ad-hoc video search
