Search Results for author: Sai-Kit Yeung

Found 54 papers, 18 papers with code

Improving Referring Image Segmentation using Vision-Aware Text Features

no code implementations 12 Apr 2024 Hai Nguyen-Truong, E-Ro Nguyen, Tuan-Anh Vu, Minh-Triet Tran, Binh-Son Hua, Sai-Kit Yeung

Our method involves using CLIP to derive a CLIP Prior that integrates an object-centric visual heatmap with text description, which can be used as the initial query in DETR-based architecture for the segmentation task.

Image Segmentation Segmentation +1

OmniGS: Omnidirectional Gaussian Splatting for Fast Radiance Field Reconstruction using Omnidirectional Images

no code implementations 4 Apr 2024 Longwei Li, Huajian Huang, Sai-Kit Yeung, Hui Cheng

In this paper, we present OmniGS, a novel omnidirectional Gaussian splatting system, to take advantage of omnidirectional images for fast radiance field reconstruction.

Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention

no code implementations 25 Jan 2024 Quang-Trung Truong, Duc Thanh Nguyen, Binh-Son Hua, Sai-Kit Yeung

This is enabled by deformable attention mechanism, where the keys and values capturing the memory of a video sequence in the attention module have flexible locations updated across frames.

Ranked #7 on Unsupervised Video Object Segmentation on DAVIS 2016 val (using extra training data)

Knowledge Distillation Object +5

Exploring Boundary of GPT-4V on Marine Analysis: A Preliminary Case Study

no code implementations 4 Jan 2024 Ziqiang Zheng, YiWei Chen, Jipeng Zhang, Tuan-Anh Vu, Huimin Zeng, Yue Him Wong Tim, Sai-Kit Yeung

In this study, we carry out a preliminary and comprehensive case study of utilizing GPT-4V for marine analysis.

Leveraging Open-Vocabulary Diffusion to Camouflaged Instance Segmentation

no code implementations 29 Dec 2023 Tuan-Anh Vu, Duc Thanh Nguyen, Qing Guo, Binh-Son Hua, Nhat Minh Chung, Ivor W. Tsang, Sai-Kit Yeung

Such cross-domain representations are desirable in segmenting camouflaged objects where visual cues are subtle to distinguish the objects from the background, especially in segmenting novel objects which are not seen in training.

Instance Segmentation Segmentation +1

Advances in 3D Neural Stylization: A Survey

1 code implementation 30 Nov 2023 Yingshu Chen, Guocheng Shao, Ka Chun Shum, Binh-Son Hua, Sai-Kit Yeung

Building on such taxonomy, our survey first revisits the background of neural stylization on 2D images, and then provides in-depth discussions on recent neural stylization methods for 3D data, where we also provide a mini-benchmark on artistic stylization methods.

Neural Stylization Style Transfer

360Loc: A Dataset and Benchmark for Omnidirectional Visual Localization with Cross-device Queries

no code implementations 29 Nov 2023 Huajian Huang, Changkun Liu, Yipeng Zhu, Hui Cheng, Tristan Braud, Sai-Kit Yeung

We propose a virtual camera approach to generate lower-FoV query frames from 360° images, which ensures a fair comparison of performance among different query types in visual localization tasks.

Visual Localization
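
The virtual-camera idea in the 360Loc entry can be illustrated with a small sketch. It is not the authors' implementation: it assumes an equirectangular panorama and a pinhole virtual camera, and the function name, parameters, and axis conventions are all hypothetical. It maps one pixel of the lower-FoV frame to the panorama location it would sample from.

```python
import math

def perspective_to_equirect(u, v, out_w, out_h, fov_deg,
                            yaw_deg, pitch_deg, pano_w, pano_h):
    """Map pixel (u, v) of a virtual pinhole camera to panorama coordinates.

    Assumed conventions (hypothetical): camera x points right, y points
    down, z points forward; positive yaw turns right, positive pitch
    looks up; the panorama is equirectangular with longitude 0 at its
    horizontal centre.
    """
    # Focal length in pixels derived from the horizontal field of view.
    f = 0.5 * out_w / math.tan(math.radians(fov_deg) / 2.0)

    # Ray through the pixel in the camera frame.
    x = u - (out_w - 1) / 2.0
    y = v - (out_h - 1) / 2.0
    z = f

    # Rotate the ray by pitch (about x), then by yaw (about y).
    p, w = math.radians(pitch_deg), math.radians(yaw_deg)
    y1 = y * math.cos(p) - z * math.sin(p)
    z1 = y * math.sin(p) + z * math.cos(p)
    x2 = x * math.cos(w) + z1 * math.sin(w)
    z2 = -x * math.sin(w) + z1 * math.cos(w)

    # Convert the ray direction to longitude and latitude.
    lon = math.atan2(x2, z2)                    # in [-pi, pi]
    lat = math.atan2(-y1, math.hypot(x2, z2))   # up is positive

    # Convert angles to (fractional) panorama pixel coordinates.
    px = (lon / (2.0 * math.pi) + 0.5) * pano_w
    py = (0.5 - lat / math.pi) * pano_h
    return px, py

# Centre pixel of a 640x480 virtual frame looking straight ahead.
centre = perspective_to_equirect(319.5, 239.5, 640, 480, 90.0,
                                 0.0, 0.0, 4096, 2048)
```

Evaluating this mapping for every output pixel and sampling the panorama at the returned coordinates would produce a full lower-FoV frame; interpolation and pixel-centre conventions are left out of the sketch.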

Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras

1 code implementation 28 Nov 2023 Huajian Huang, Longwei Li, Hui Cheng, Sai-Kit Yeung

In addition to actively densifying hyper primitives based on geometric features, we further introduce a Gaussian-Pyramid-based training method to progressively learn multi-level features, enhancing photorealistic mapping performance.

Neural Rendering

Test-Time Augmentation for 3D Point Cloud Classification and Segmentation

no code implementations 22 Nov 2023 Tuan-Anh Vu, Srinjay Sarkar, Zhiyuan Zhang, Binh-Son Hua, Sai-Kit Yeung

We are inspired by the recent revolution of learning implicit representation and point cloud upsampling, which can produce high-quality 3D surface reconstruction and proximity-to-surface, respectively.

3D Point Cloud Classification Data Augmentation +3

MarineGPT: Unlocking Secrets of Ocean to the Public

1 code implementation 20 Oct 2023 Ziqiang Zheng, Jipeng Zhang, Tuan-Anh Vu, Shizhe Diao, Yue Him Wong Tim, Sai-Kit Yeung

Large language models (LLMs), such as ChatGPT/GPT-4, have proven to be powerful tools in promoting the user experience as an AI assistant.

Language Modelling

CoralVOS: Dataset and Benchmark for Coral Video Segmentation

no code implementations 3 Oct 2023 Zheng Ziqiang, Xie Yaofeng, Liang Haixin, Yu Zhibin, Sai-Kit Yeung

We perform experiments on our CoralVOS dataset, including 6 recent state-of-the-art video object segmentation (VOS) algorithms.

Segmentation Semantic Segmentation +3

Language-driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates

1 code implementation 20 Sep 2023 Ka Chun Shum, Jaeyeon Kim, Binh-Son Hua, Duc Thanh Nguyen, Sai-Kit Yeung

Specifically, to insert a new foreground object represented by a set of multi-view images into a background radiance field, we use a text-to-image diffusion model to learn and generate combined images that fuse the object of interest into the given background across views.

3D Reconstruction Object +1

360VOT: A New Benchmark Dataset for Omnidirectional Visual Object Tracking

1 code implementation ICCV 2023 Huajian Huang, Yinzhe Xu, Yingshu Chen, Sai-Kit Yeung

360° images can provide an omnidirectional field of view which is important for stable and long-term scene perception.

Visual Object Tracking

MarineVRS: Marine Video Retrieval System with Explainability via Semantic Understanding

no code implementations 7 Jun 2023 Tan-Sang Ha, Hai Nguyen-Truong, Tuan-Anh Vu, Sai-Kit Yeung

Building a video retrieval system that is robust and reliable, especially for the marine environment, is a challenging task due to several factors such as dealing with massive amounts of dense and repetitive data, occlusion, blurriness, low lighting conditions, and abstract queries.

Retrieval Sentence +1

1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

no code implementations 24 Nov 2022 Benjamin Kiefer, Matej Kristan, Janez Perš, Lojze Žust, Fabio Poiesi, Fabio Augusto de Alcantara Andrade, Alexandre Bernardino, Matthew Dawkins, Jenni Raitoharju, Yitong Quan, Adem Atmaca, Timon Höfer, Qiming Zhang, Yufei Xu, Jing Zhang, DaCheng Tao, Lars Sommer, Raphael Spraul, Hangyue Zhao, Hongpu Zhang, Yanyun Zhao, Jan Lukas Augustin, Eui-ik Jeon, Impyeong Lee, Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Sagar Verma, Siddharth Gupta, Shishir Muralidhara, Niharika Hegde, Daitao Xing, Nikolaos Evangeliou, Anthony Tzes, Vojtěch Bartl, Jakub Špaňhel, Adam Herout, Neelanjan Bhowmik, Toby P. Breckon, Shivanand Kundargi, Tejas Anvekar, Chaitra Desai, Ramesh Ashok Tabib, Uma Mudengudi, Arpita Vats, Yang song, Delong Liu, Yonglin Li, Shuman Li, Chenhao Tan, Long Lan, Vladimir Somers, Christophe De Vleeschouwer, Alexandre Alahi, Hsiang-Wei Huang, Cheng-Yen Yang, Jenq-Neng Hwang, Pyong-Kun Kim, Kwangju Kim, Kyoungoh Lee, Shuai Jiang, Haiwen Li, Zheng Ziqiang, Tuan-Anh Vu, Hai Nguyen-Truong, Sai-Kit Yeung, Zhuang Jia, Sophia Yang, Chih-Chung Hsu, Xiu-Yu Hou, Yu-An Jhang, Simon Yang, Mau-Tsuen Yang

The 1st Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection.

Object object-detection +2

Marine Video Kit: A New Marine Video Dataset for Content-based Analysis and Retrieval

1 code implementation 23 Sep 2022 Quang-Trung Truong, Tuan-Anh Vu, Tan-Sang Ha, Lokoc Jakub, Yue Him Wong Tim, Ajay Joneja, Sai-Kit Yeung

It is important to remember that domain specific data may be noisier (e.g., endoscopic or underwater videos) and often require more experienced users for effective search.

Ranked #1 on Retrieval on MVK

Retrieval Video Retrieval

Time-of-Day Neural Style Transfer for Architectural Photographs

1 code implementation 13 Sep 2022 Yingshu Chen, Tuan-Anh Vu, Ka-Chun Shum, Binh-Son Hua, Sai-Kit Yeung

Architectural photography is a genre of photography that focuses on capturing a building or structure in the foreground with dramatic lighting in the background.

Image-to-Image Translation Style Transfer +1

360Roam: Real-Time Indoor Roaming Using Geometry-Aware 360° Radiance Fields

no code implementations 4 Aug 2022 Huajian Huang, Yingshu Chen, Tianjia Zhang, Sai-Kit Yeung

Subsequently, we assign local radiance fields through an adaptive divide-and-conquer strategy based on the recovered geometry.

Novel View Synthesis

RFNet-4D++: Joint Object Reconstruction and Flow Estimation from 4D Point Clouds with Cross-Attention Spatio-Temporal Features

1 code implementation 30 Mar 2022 Tuan-Anh Vu, Duc Thanh Nguyen, Binh-Son Hua, Quang-Hieu Pham, Sai-Kit Yeung

The key insight is that simultaneously performing both tasks via learning of spatial and temporal features from a sequence of point clouds can leverage individual tasks, leading to improved overall performance.

3D Human Reconstruction 3D Reconstruction +3

RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning

1 code implementation 26 Feb 2022 Zhiyuan Zhang, Binh-Son Hua, Sai-Kit Yeung

3D point clouds deep learning is a promising field of research that allows a neural network to learn features of point clouds directly, making it a robust tool for solving 3D scene understanding tasks.

3D Point Cloud Classification Point Cloud Segmentation +2

ACNet: Approaching-and-Centralizing Network for Zero-Shot Sketch-Based Image Retrieval

no code implementations 24 Nov 2021 Hao Ren, Ziqiang Zheng, Yang Wu, Hong Lu, Yang Yang, Ying Shan, Sai-Kit Yeung

The huge domain gap between sketches and photos and the highly abstract sketch representations pose challenges for sketch-based image retrieval (SBIR).

Retrieval Sketch-Based Image Retrieval

Neural Scene Decoration from a Single Photograph

1 code implementation 4 Aug 2021 Hong-Wing Pang, Yingshu Chen, Phuoc-Hieu Le, Binh-Son Hua, Duc Thanh Nguyen, Sai-Kit Yeung

In this paper, we introduce a new problem of domain-specific indoor scene image synthesis, namely neural scene decoration.

Image Generation Scene Generation

Dual-SLAM: A framework for robust single camera navigation

no code implementations 23 Sep 2020 Huajian Huang, Wen-Yan Lin, Siying Liu, Dong Zhang, Sai-Kit Yeung

As local pose estimation is ill-conditioned, local pose estimation failures happen regularly, making the overall SLAM system brittle.

Pose Estimation Simultaneous Localization and Mapping

Minimal Adversarial Examples for Deep Learning on 3D Point Clouds

no code implementations ICCV 2021 Jaeyeon Kim, Binh-Son Hua, Duc Thanh Nguyen, Sai-Kit Yeung

With recent developments of convolutional neural networks, deep learning for 3D point clouds has shown significant progress in various 3D scene understanding tasks, e.g., object recognition, semantic segmentation.

3D Object Recognition Object Detection +3

Global Context Aware Convolutions for 3D Point Cloud Understanding

no code implementations 7 Aug 2020 Zhiyuan Zhang, Binh-Son Hua, Wei Chen, Yibin Tian, Sai-Kit Yeung

We found that a key reason is that compared to point coordinates, rotation-invariant features consumed by point cloud convolution are not as distinctive.

Point Cloud Classification Retrieval +1

LCD: Learned Cross-Domain Descriptors for 2D-3D Matching

1 code implementation 21 Nov 2019 Quang-Hieu Pham, Mikaela Angelina Uy, Binh-Son Hua, Duc Thanh Nguyen, Gemma Roig, Sai-Kit Yeung

In this work, we present a novel method to learn a local cross-domain descriptor for 2D image and 3D point cloud matching.

3D Point Cloud Matching Depth Estimation +1

Rotation Invariant Convolutions for 3D Point Clouds Deep Learning

1 code implementation 17 Aug 2019 Zhiyuan Zhang, Binh-Son Hua, David W. Rosen, Sai-Kit Yeung

Our core idea is to use low-level rotation invariant geometric features such as distances and angles to design a convolution operator for point cloud learning.

Scene Understanding Translation
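
Why distances and angles are a natural basis for rotation-invariant features can be checked numerically. The sketch below is only an illustration, not the paper's convolution operator: the feature tuple in `local_features` (two distances and one angle around a reference point) and the sample points are hypothetical choices. It computes the features before and after rotating all points about the z-axis.

```python
import math

def rotate_z(p, theta):
    """Rotate a 3D point about the z-axis by theta radians."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def sub(a, b):
    """Component-wise difference a - b."""
    return tuple(ai - bi for ai, bi in zip(a, b))

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(vi * vi for vi in v))

def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    cos_uv = sum(ui * vi for ui, vi in zip(u, v)) / (norm(u) * norm(v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_uv)))

def local_features(ref, nbr, centroid):
    """Hypothetical feature tuple: two distances and one angle at ref."""
    to_nbr = sub(nbr, ref)
    to_centroid = sub(centroid, ref)
    return (norm(to_nbr), norm(to_centroid), angle(to_nbr, to_centroid))

# Arbitrary sample points (illustrative values only).
ref, nbr, centroid = (1.0, 0.0, 0.5), (0.2, 1.3, -0.4), (0.5, 0.5, 0.0)

before = local_features(ref, nbr, centroid)
after = local_features(*(rotate_z(p, 0.7) for p in (ref, nbr, centroid)))

# The largest change in any feature is at floating-point noise level.
max_diff = max(abs(a - b) for a, b in zip(before, after))
```

Raw coordinates would change under the same rotation, which is why a network fed only such features does not need to see rotated copies of the data to handle them.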

ShellNet: Efficient Point Cloud Convolutional Neural Networks using Concentric Shells Statistics

1 code implementation ICCV 2019 Zhiyuan Zhang, Binh-Son Hua, Sai-Kit Yeung

Deep learning with 3D data has progressed significantly since the introduction of convolutional neural networks that can handle point order ambiguity in point cloud data.

3D Point Cloud Classification 3D Semantic Segmentation +2

Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data

1 code implementation ICCV 2019 Mikaela Angelina Uy, Quang-Hieu Pham, Binh-Son Hua, Duc Thanh Nguyen, Sai-Kit Yeung

From our comprehensive benchmark, we show that our dataset poses great challenges to existing point cloud classification techniques as objects from real-world scans are often cluttered with background and/or are partial due to occlusions.

3D Object Classification Classification +3

Uncalibrated Photometric Stereo Under Natural Illumination

no code implementations CVPR 2018 Zhipeng Mo, Boxin Shi, Feng Lu, Sai-Kit Yeung, Yasuyuki Matsushita

This paper presents a photometric stereo method that works with unknown natural illuminations without any calibration object.

Real-time Progressive 3D Semantic Segmentation for Indoor Scene

no code implementations 1 Apr 2018 Quang-Hieu Pham, Binh-Son Hua, Duc Thanh Nguyen, Sai-Kit Yeung

The widespread adoption of autonomous systems such as drones and assistant robots has created a need for real-time high-quality semantic scene segmentation.

3D Semantic Segmentation Clustering +2

Pointwise Convolutional Neural Networks

1 code implementation CVPR 2018 Binh-Son Hua, Minh-Khoi Tran, Sai-Kit Yeung

Deep learning with 3D data such as reconstructed point clouds and CAD models has received great research interest recently.

Object Object Recognition +2

Radiometric Calibration for Internet Photo Collections

no code implementations CVPR 2017 Zhipeng Mo, Boxin Shi, Sai-Kit Yeung, Yasuyuki Matsushita

Radiometrically calibrating the images from Internet photo collections brings photometric analysis from lab data to big image data in the wild, but conventional calibration methods cannot be directly applied to such image data.

A Field Model for Repairing 3D Shapes

no code implementations CVPR 2016 Duc Thanh Nguyen, Binh-Son Hua, Khoi Tran, Quang-Hieu Pham, Sai-Kit Yeung

The proposed method was evaluated on both artificial data and real data obtained from reconstruction of practical scenes.

A Benchmark Dataset and Evaluation for Non-Lambertian and Uncalibrated Photometric Stereo

no code implementations CVPR 2016 Boxin Shi, Zhe Wu, Zhipeng Mo, Dinglong Duan, Sai-Kit Yeung, Ping Tan

Recent progress on photometric stereo extends the technique to deal with general materials and unknown illumination conditions.

Towards Building an RGBD-M Scanner

no code implementations 12 Mar 2016 Zhe Wu, Sai-Kit Yeung, Ping Tan

We present a portable device to capture both shape and reflectance of an indoor scene.

Segmentation Rectification for Video Cutout via One-Class Structured Learning

no code implementations 16 Feb 2016 Junyan Wang, Sai-Kit Yeung, Jue Wang, Kun Zhou

Comprehensive experiments on both RGB and RGB-D data demonstrate that our simple and effective method significantly outperforms the segmentation propagation methods adopted in the state-of-the-art video cutout systems, and the results also suggest the potential usefulness of our method in image cutout system.

Segmentation

A Closed-Form Solution to Tensor Voting: Theory and Applications

no code implementations 19 Jan 2016 Tai-Pang Wu, Sai-Kit Yeung, Jiaya Jia, Chi-Keung Tang, Gerard Medioni

We prove a closed-form solution to tensor voting (CFTV): given a point set in any dimensions, our closed-form solution provides an exact, continuous and efficient algorithm for computing a structure-aware tensor that simultaneously achieves salient structure detection and outlier attenuation.

Stereo Matching Stereo Matching Hand

Fill and Transfer: A Simple Physics-Based Approach for Containability Reasoning

no code implementations ICCV 2015 Lap-Fai Yu, Noah Duncan, Sai-Kit Yeung

We apply our approach to reason about the containability of several real-world objects acquired using a consumer-grade depth camera.

An MRF-Poselets Model for Detecting Highly Articulated Humans

no code implementations ICCV 2015 Duc Thanh Nguyen, Minh-Khoi Tran, Sai-Kit Yeung

The problem of human detection is then formulated as maximum a posteriori (MAP) estimation in the MRF model.

Human Detection

A Compact Linear Programming Relaxation for Binary Sub-modular MRF

no code implementations 9 Apr 2014 Junyan Wang, Sai-Kit Yeung

We propose a novel compact linear programming (LP) relaxation for binary sub-modular MRF in the context of object segmentation.

Interactive Segmentation Segmentation +1

Shading-Based Shape Refinement of RGB-D Images

no code implementations CVPR 2013 Lap-Fai Yu, Sai-Kit Yeung, Yu-Wing Tai, Stephen Lin

We present a shading-based shape refinement algorithm which uses a noisy, incomplete depth map from Kinect to help resolve ambiguities in shape-from-shading.
