Search Results for author: Hong Liu

Found 164 papers, 89 papers with code

Dynamic Multistep Reasoning based on Video Scene Graph for Video Question Answering

no code implementations NAACL 2022 Jianguo Mao, Wenbin Jiang, Xiangdong Wang, Zhifan Feng, Yajuan Lyu, Hong Liu, Yong Zhu

Then, it performs multistep reasoning for better answer decision between the representations of the question and the video, and dynamically integrates the reasoning results.

Question Answering Video Question Answering +1

API-Net: Robust Generative Classifier via a Single Discriminator

1 code implementation ECCV 2020 Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian

On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.

Robust classification

Dual-Branch Graph Transformer Network for 3D Human Mesh Reconstruction from Video

1 code implementation 2 Dec 2024 Tao Tang, Hong Liu, Yingxuan You, Ti Wang, Wenhao Li

DGTR employs a dual-branch network including a Global Motion Attention (GMA) branch and a Local Details Refine (LDR) branch to extract long-term dependencies and crucial local information in parallel, helping model global human motion and local human details (e.g., local motion, tiny movement).

CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search

no code implementations 2 Dec 2024 Kaixin Wu, Yixin Ji, Zeyuan Chen, Qiang Wang, Cunxiang Wang, Hong Liu, Baijun Ji, Jia Xu, Zhongyi Liu, Jinjie Gu, Yuan Zhou, Linjian Mo

Our CPRM framework includes three modules: 1) employing both queries and multi-field items to jointly pre-train for enhancing domain knowledge, 2) applying in-context pre-training, a novel approach where LLMs are pre-trained on a sequence of related queries or items, and 3) conducting reading comprehension on items to produce associated domain knowledge and background information (e.g., generating summaries and corresponding queries) to further strengthen LLMs.

In-Context Learning Reading Comprehension

DoubleCCA: Improving Foundation Model Group Robustness with Random Sentence Embeddings

no code implementations 25 Nov 2024 Hong Liu, Yitong Lu

Second, we use an additional sentence embedding model to generate different text embeddings with respect to these random sentences.

Sentence Sentence Embedding +1

D$^3$epth: Self-Supervised Depth Estimation with Dynamic Mask in Dynamic Scenes

no code implementations 7 Nov 2024 Siyu Chen, Hong Liu, Wenhao Li, Ying Zhu, Guoquan Wang, Jianbing Wu

First, within the self-supervised framework, we design a reprojection constraint to identify regions likely to contain dynamic objects, allowing the construction of a dynamic mask that mitigates their impact at the loss level.

Monocular Depth Estimation

Diffusion as Reasoning: Enhancing Object Goal Navigation with LLM-Biased Diffusion Model

no code implementations 29 Oct 2024 Yiming Ji, Yang Liu, Zhengpu Wang, Boyu Ma, Zongwu Xie, Hong Liu

Diffusion models have been shown to be able to learn the distribution relationships between features in RGB images, and thus generate new realistic images. In this work, we propose a new approach to solving the ObjectNav task, by training a diffusion model to learn the statistical distribution patterns of objects in semantic maps, and using the map of the explored regions during navigation as the condition to generate the map of the unknown regions, thereby realizing the semantic reasoning of the target object, i.e., diffusion as reasoning (DAR).

Common Sense Reasoning Navigate

DMT-HI: MOE-based Hyperbolic Interpretable Deep Manifold Transformation for Unspervised Dimensionality Reduction

1 code implementation 25 Oct 2024 Zelin Zang, Yuhao Wang, Jinlin Wu, Hong Liu, Yue Shen, Stan Z. Li, Zhen Lei

DMT-HI enhances DR accuracy by leveraging hyperbolic embeddings to represent the hierarchical nature of data, while also improving interpretability by explicitly linking input data, embedding outcomes, and key features through the MOE structure.

Dimensionality Reduction

ARTS: Semi-Analytical Regressor using Disentangled Skeletal Representations for Human Mesh Recovery from Videos

1 code implementation 21 Oct 2024 Tao Tang, Hong Liu, Yingxuan You, Ti Wang, Wenhao Li

Then, to fully utilize these representations, we introduce a semi-analytical regressor to estimate the parameters of the human mesh model.

Disentanglement Human Mesh Recovery +2

STNet: Deep Audio-Visual Fusion Network for Robust Speaker Tracking

1 code implementation 8 Oct 2024 Yidi Li, Hong Liu, Bing Yang

Audio-visual speaker tracking aims to determine the location of human targets in a scene using signals captured by a multi-sensor platform, whose accuracy and robustness can be improved by multi-modal fusion methods.

DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion

1 code implementation 6 Oct 2024 Ke Sun, Shen Chen, Taiping Yao, Hong Liu, Xiaoshuai Sun, Shouhong Ding, Rongrong Ji

The rapid progress of Deepfake technology has made face swapping highly realistic, raising concerns about the malicious use of fabricated facial content.

DeepFake Detection Domain Generalization +1

A Multimodal Object-level Contrast Learning Method for Cancer Survival Risk Prediction

1 code implementation 3 Sep 2024 Zekang Yang, Hong Liu, Xiangdong Wang

In this paper, we propose a new training method, multimodal object-level contrast learning, for cancer survival risk prediction.

Global-Local Distillation Network-Based Audio-Visual Speaker Tracking with Incomplete Modalities

no code implementations 26 Aug 2024 Yidi Li, Yihan Li, Yixin Guo, Bin Ren, Zhenhuan Xu, Hao Guo, Hong Liu, Nicu Sebe

By transferring knowledge from teacher to student, the student network can better adapt to complex dynamic scenes with incomplete observations.

Generative Adversarial Network

ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models

1 code implementation 16 Aug 2024 Chao Zeng, Songwei Liu, Yusheng Xie, Hong Liu, Xiaojian Wang, Miao Wei, Shu Yang, Fangmin Chen, Xing Mei

Based on the W2*A8 quantization configuration on the LLaMA-7B model, it achieved a WikiText2 perplexity of 7.59 (2.17$\downarrow$ vs 9.76 in AffineQuant).

Model Compression Quantization

Advancing Multi-grained Alignment for Contrastive Language-Audio Pre-training

1 code implementation 15 Aug 2024 Yiming Li, Zhifang Guo, Xiangdong Wang, Hong Liu

Recent advances have been witnessed in audio-language joint learning, such as CLAP, which shows much success in multi-modal understanding tasks.

cross-modal alignment

ClickDiff: Click to Induce Semantic Contact Map for Controllable Grasp Generation with Diffusion Models

1 code implementation 28 Jul 2024 Peiming Li, Ziyi Wang, Mengyuan Liu, Hong Liu, Chen Chen

To address these challenges, we propose a controllable grasp generation task and introduce ClickDiff, a controllable conditional generation model that leverages a fine-grained Semantic Contact Map (SCM).

Controllable Grasp Generation Object

USD: Unsupervised Soft Contrastive Learning for Fault Detection in Multivariate Time Series

1 code implementation 25 May 2024 Hong Liu, Xiuxiu Qiu, Yiming Shi, Zelin Zang

Unsupervised fault detection in multivariate time series is critical for maintaining the integrity and efficiency of complex systems, with current methodologies largely focusing on statistical and machine learning techniques.

Contrastive Learning Data Augmentation +3

Learning Social Graph for Inactive User Recommendation

1 code implementation 8 May 2024 Nian Liu, Shen Fan, Ting Bai, Peng Wang, Mingwei Sun, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Chuan Shi

In this paper, we propose a novel social recommendation method called LSIR (Learning Social Graph for Inactive User Recommendation) that learns an optimal social graph structure for social recommendation, especially for inactive users.

Graph structure learning Recommendation Systems

Transformer-Enhanced Motion Planner: Attention-Guided Sampling for State-Specific Decision Making

no code implementations 30 Apr 2024 Lei Zhuang, Jingdong Zhao, Yuntao Li, Zichun Xu, Liangliang Zhao, Hong Liu

EISE and MPT are collaboratively trained, enabling EISE to autonomously learn and extract patterns from environmental data, thereby forming semantic representations that MPT could more effectively interpret and utilize for motion planning.

Decision Making Motion Planning

MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Human Motions

1 code implementation 21 Apr 2024 Sheng Yan, Mengyuan Liu, Yong Wang, Yang Liu, Chen Chen, Hong Liu

In this paper, we address the unexplored question of temporal sentence localization in human motions (TSLM), aiming to locate a target moment from a 3D human motion that semantically corresponds to a text query.

Moment Retrieval Sentence

Federated Modality-specific Encoders and Multimodal Anchors for Personalized Brain Tumor Segmentation

1 code implementation 18 Mar 2024 Qian Dai, Dong Wei, Hong Liu, Jinghan Sun, Liansheng Wang, Yefeng Zheng

In practice, it is not uncommon that some FL participants only possess a subset of the complete imaging modalities, posing inter-modal heterogeneity as a challenge to effectively training a global model on all participants' data.

Brain Tumor Segmentation Federated Learning +2

AVIBench: Towards Evaluating the Robustness of Large Vision-Language Model on Adversarial Visual-Instructions

no code implementations 14 Mar 2024 Hao Zhang, Wenqi Shao, Hong Liu, Yongqiang Ma, Ping Luo, Yu Qiao, Kaipeng Zhang

To bridge this gap, we introduce AVIBench, a framework designed to analyze the robustness of LVLMs when facing various adversarial visual-instructions (AVIs), including four types of image-based AVIs, ten types of text-based AVIs, and nine types of content bias AVIs (such as gender, violence, cultural, and racial biases, among others).

Fairness Language Modelling

WSI-SAM: Multi-resolution Segment Anything Model (SAM) for histopathology whole-slide images

1 code implementation 14 Mar 2024 Hong Liu, Haosen Yang, Paul J. van Diest, Josien P. W. Pluim, Mitko Veta

In particular, our model outperforms SAM by 4.1 and 2.5 percentage points on a ductal carcinoma in situ (DCIS) segmentation task and a breast cancer metastasis segmentation task (CAMELYON16 dataset), respectively.

Decoder Segmentation +2

Identity-aware Dual-constraint Network for Cloth-Changing Person Re-identification

no code implementations 13 Mar 2024 Peini Guo, Mengyuan Liu, Hong Liu, Ruijia Fan, Guoquan Wang, Bin He

In addition, a Multi-scale Constraint Block (MCB) is designed, which extracts fine-grained identity-related features and effectively transfers cloth-irrelevant knowledge.

Cloth-Changing Person Re-Identification counterfactual

Position: Towards Implicit Prompt For Text-To-Image Models

no code implementations 4 Mar 2024 Yue Yang, Yuqi Lin, Hong Liu, Wenqi Shao, Runjian Chen, Hailong Shang, Yu Wang, Yu Qiao, Kaipeng Zhang, Ping Luo

We call for increased attention to the potential and risks of implicit prompts in the T2I community and further investigation into the capabilities and impacts of implicit prompts, advocating for a balanced approach that harnesses their benefits while mitigating their risks.

Position

Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

no code implementations 20 Feb 2024 Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma

Given input length $n$, previous works have shown that constant-depth transformers with finite precision $\mathsf{poly}(n)$ embedding size can only solve problems in $\mathsf{TC}^0$ without CoT.

Decoder

GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks

1 code implementation 11 Feb 2024 Mengmei Zhang, Mingwei Sun, Peng Wang, Shen Fan, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Cheng Yang, Chuan Shi

Large language models (LLMs) like ChatGPT, which exhibit powerful zero-shot and instruction-following capabilities, have catalyzed a revolutionary transformation across diverse fields, especially for open-ended tasks.

Graph Question Answering Instruction Following +4

Uncertainty-Aware Testing-Time Optimization for 3D Human Pose Estimation

no code implementations 4 Feb 2024 Ti Wang, Mengyuan Liu, Hong Liu, Bin Ren, Yingxuan You, Wenhao Li, Nicu Sebe, Xia Li

We observe that previous optimization-based methods commonly rely on a projection constraint, which only ensures alignment in 2D space, potentially leading to the overfitting problem.

3D Human Pose Estimation

Learning Mutual Excitation for Hand-to-Hand and Human-to-Human Interaction Recognition

no code implementations 4 Feb 2024 Mengyuan Liu, Chen Chen, Songtao Wu, Fanyang Meng, Hong Liu

Recognizing interactive actions, including hand-to-hand interaction and human-to-human interaction, has attracted increasing attention for various applications in the field of video analysis and human-robot interaction.

Action Recognition Human Interaction Recognition

GraphGPT: Graph Learning with Generative Pre-trained Transformers

1 code implementation 31 Dec 2023 Qifang Zhao, Weidong Ren, Tianyu Li, Xiaoxiao Xu, Hong Liu

We introduce GraphGPT, a novel model for Graph learning by self-supervised Generative Pre-training Transformers.

Decoder Graph Learning +1

Near-Optimal Resilient Aggregation Rules for Distributed Learning Using 1-Center and 1-Mean Clustering with Outliers

1 code implementation 20 Dec 2023 Yuhao Yi, Ronghui You, Hong Liu, Changxin Liu, YuAn Wang, Jiancheng Lv

Our analysis shows that constant approximations to the 1-center and 1-mean clustering problems with outliers provide near-optimal resilient aggregators for metric-based criteria, which have been proven to be crucial in the homogeneous and heterogeneous cases respectively.

Clustering Image Classification

Simultaneous Alignment and Surface Regression Using Hybrid 2D-3D Networks for 3D Coherent Layer Segmentation of Retinal OCT Images with Full and Sparse Annotations

1 code implementation 4 Dec 2023 Hong Liu, Dong Wei, Donghuan Lu, Xiaoying Tang, Liansheng Wang, Yefeng Zheng

Experiments on a synthetic dataset and three public clinical datasets show that our framework can effectively align the B-scans for potential motion correction, and achieves superior performance to state-of-the-art 2D deep learning methods in terms of both layer segmentation accuracy and cross-B-scan 3D continuity in both fully and semi-supervised settings, thus offering more clinical values than previous works.

Segmentation

Prompt Pool based Class-Incremental Continual Learning for Dialog State Tracking

1 code implementation 17 Nov 2023 Hong Liu, Yucheng Cai, Yuan Zhou, Zhijian Ou, Yi Huang, Junlan Feng

Inspired by the recently emerging prompt tuning method that performs well on dialog systems, we propose to use the prompt pool method, where we maintain a pool of key-value paired prompts and select prompts from the pool according to the distance between the dialog history and the prompt keys.

Continual Learning dialog state tracking

Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning

1 code implementation 20 Sep 2023 Chen Jiang, Hong Liu, Xuzheng Yu, Qing Wang, Yuan Cheng, Jia Xu, Zhongyi Liu, Qingpei Guo, Wei Chu, Ming Yang, Yuan Qi

We thereby present a new Triplet Partial Margin Contrastive Learning (TPM-CL) module to construct partial order triplet samples by automatically generating fine-grained hard negatives for matched text-video pairs.

Contrastive Learning Retrieval +4

Audio-free Prompt Tuning for Language-Audio Models

no code implementations 15 Sep 2023 Yiming Li, Xiangdong Wang, Hong Liu

Contrastive Language-Audio Pretraining (CLAP) is pre-trained to associate audio features with human language, making it a natural zero-shot classifier to recognize unseen sound categories.

Semi-supervised Sound Event Detection with Local and Global Consistency Regularization

no code implementations 15 Sep 2023 Yiming Li, Xiangdong Wang, Hong Liu, Rui Tao, Long Yan, Kazushige Ouchi

Then, the local consistency is adopted to encourage the model to leverage local features for frame-level predictions, and the global consistency is applied to force features to align with global prototypes through a specially designed contrastive loss.

Event Detection Sound Event Detection

Semantic-aware Consistency Network for Cloth-changing Person Re-Identification

1 code implementation 27 Aug 2023 Peini Guo, Hong Liu, Jianbing Wu, Guoquan Wang, Tao Wang

Despite recent progress in CC-ReID, existing approaches are still hindered by the interference of clothing variations since they lack effective constraints to keep the model consistently focused on clothing-irrelevant regions.

Cloth-Changing Person Re-Identification

Audio Generation with Multiple Conditional Diffusion Model

no code implementations 23 Aug 2023 Zhifang Guo, Jianguo Mao, Rui Tao, Long Yan, Kazushige Ouchi, Hong Liu, Xiangdong Wang

To address this issue, we propose a novel model that enhances the controllability of existing pre-trained text-to-audio models by incorporating additional conditions including content (timestamp) and style (pitch contour and energy contour) as supplements to the text.

Audio Generation Diversity +2

Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video

1 code implementation ICCV 2023 Yingxuan You, Hong Liu, Ti Wang, Wenhao Li, Runwei Ding, Xia Li

Despite significant progress in single image-based 3D human mesh recovery, accurately and smoothly recovering 3D human motion from a video remains challenging.

3D Human Pose Estimation Decoder +1

GraphDAC: A Graph-Analytic Approach to Dynamic Airspace Configuration

1 code implementation 29 Jul 2023 Ke Feng, Dahai Liu, Yongxin Liu, Hong Liu, Houbing Song

The current National Airspace System (NAS) is reaching capacity due to increased air traffic, and is based on outdated pre-tactical planning.

Clustering

UniMatch: A Unified User-Item Matching Framework for the Multi-purpose Merchant Marketing

no code implementations 19 Jul 2023 Qifang Zhao, Tianyu Li, Meng Du, Yu Jiang, Qinghui Sun, Zhongyao Wang, Hong Liu, Huan Xu

When doing private domain marketing with cloud services, merchants usually have to purchase different machine learning models for multiple marketing purposes, leading to a very high cost.

Marketing

You've Got Two Teachers: Co-evolutionary Image and Report Distillation for Semi-supervised Anatomical Abnormality Detection in Chest X-ray

no code implementations 18 Jul 2023 Jinghan Sun, Dong Wei, Zhe Xu, Donghuan Lu, Hong Liu, Liansheng Wang, Yefeng Zheng

Inversely, we also use the prediction of the vision detection model for abnormality-guided pseudo classification label refinement (APCLR) in the auxiliary report classification task, and propose a co-evolution strategy where the vision and report models mutually promote each other with RPDLR and APCLR performed alternately.

Anomaly Detection Pseudo Label

Cross-Model Cross-Stream Learning for Self-Supervised Human Action Recognition

2 code implementations 15 Jul 2023 Mengyuan Liu, Hong Liu, Tianyu Guo

Inspired by SkeletonBYOL, this paper further presents a Cross-Model and Cross-Stream (CMCS) framework.

Contrastive Learning Ensemble Learning +7

A Gated Cross-domain Collaborative Network for Underwater Object Detection

1 code implementation 25 Jun 2023 Linhui Dai, Hong Liu, Pinhao Song, Mengyuan Liu

Firstly, a real-time UIE method is employed to generate enhanced images, which can improve the visibility of objects in low-contrast areas.

object-detection Object Detection +1

FSAR: Federated Skeleton-based Action Recognition with Adaptive Topology Structure and Knowledge Distillation

no code implementations ICCV 2023 Jingwen Guo, Hong Liu, Shitong Sun, Tianyu Guo, Min Zhang, Chenyang Si

Existing skeleton-based action recognition methods typically follow a centralized learning paradigm, which can pose privacy concerns when exposing human-related videos.

Action Recognition Federated Learning +3

Revisiting and Advancing Adversarial Training Through A Simple Baseline

no code implementations 13 Jun 2023 Hong Liu

In this paper, we delve into the essential components of adversarial training, which is a pioneering defense technique against adversarial attacks.

Adversarial Defense Adversarial Robustness +1

Sparse-Inductive Generative Adversarial Hashing for Nearest Neighbor Search

no code implementations 12 Jun 2023 Hong Liu

In this paper, we propose a novel unsupervised hashing method, termed Sparsity-Induced Generative Adversarial Hashing (SiGAH), to encode large-scale high-dimensional features into binary codes, which well solves the two problems through a generative adversarial training framework.

Quantization

STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection

1 code implementation CVPR 2023 Zhenglin Zhou, Huaxia Li, Hong Liu, Nanyang Wang, Gang Yu, Rongrong Ji

To solve this problem, we propose a Self-adapTive Ambiguity Reduction (STAR) loss by exploiting the properties of semantic ambiguity.

Face Alignment Facial Landmark Detection

Edge-guided Representation Learning for Underwater Object Detection

no code implementations 1 Jun 2023 Linhui Dai, Hong Liu, Pinhao Song, Hao Tang, Runwei Ding, Shengquan Li

The key to addressing these challenges is to focus the model on obtaining more discriminative information.

Object object-detection +2

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

6 code implementations 23 May 2023 Hong Liu, Zhiyuan Li, David Hall, Percy Liang, Tengyu Ma

Given the massive cost of language model pre-training, a non-trivial improvement of the optimization algorithm would lead to a material reduction in the time and cost of training.

Language Modelling Stochastic Optimization

Exploring Energy-based Language Models with Different Architectures and Training Methods for Speech Recognition

1 code implementation 22 May 2023 Hong Liu, Zhaobiao Lv, Zhijian Ou, Wenbo Zhao, Qing Xiao

Energy-based language models (ELMs) parameterize an unnormalized distribution for natural sentences and are radically different from popular autoregressive language models (ALMs).

Sentence speech-recognition +1

Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision

1 code implementation 22 May 2023 Yucheng Cai, Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng

Most existing task-oriented dialog (TOD) systems track dialog states in terms of slots and values and use them to query a database to get relevant knowledge to generate responses.

Question Answering Retrieval

Cross-Modal Retrieval for Motion and Text via DopTriple Loss

1 code implementation 7 May 2023 Sheng Yan, Yang Liu, Haoqiang Wang, Xin Du, Mengyuan Liu, Hong Liu

On the latest HumanML3D dataset, we achieve a recall of 62.9% for motion retrieval and 71.5% for text retrieval (both based on R@10).

Cross-Modal Retrieval Text Retrieval +1

Interweaved Graph and Attention Network for 3D Human Pose Estimation

1 code implementation 27 Apr 2023 Ti Wang, Hong Liu, Runwei Ding, Wenhao Li, Yingxuan You, Xia Li

Despite substantial progress in 3D human pose estimation from a single-view image, prior works rarely explore global and local correlations, leading to insufficient learning of human skeleton representations.

3D Human Pose Estimation

Solving Dynamic Traveling Salesman Problems With Deep Reinforcement Learning

1 code implementation journal 2023 Zizhen Zhang, Hong Liu, Mengchu Zhou, Jiahai Wang

This brings in a dynamic version of the traveling salesman problem (DTSP), which takes into account the information of real-time traffic and customer requests.

Deep Reinforcement Learning reinforcement-learning +1

Latent Feature Relation Consistency for Adversarial Robustness

1 code implementation 29 Mar 2023 Xingbin Liu, Huafeng Kuang, Hong Liu, Xianming Lin, Yongjian Wu, Rongrong Ji

Deep neural networks have been applied in many computer vision tasks and achieved state-of-the-art performance.

Adversarial Robustness Relation

M3AE: Multimodal Representation Learning for Brain Tumor Segmentation with Missing Modalities

1 code implementation 9 Mar 2023 Hong Liu, Dong Wei, Donghuan Lu, Jinghan Sun, Liansheng Wang, Yefeng Zheng

In the first stage, a multimodal masked autoencoder (M3AE) is proposed, where both random modalities (i.e., modality dropout) and random patches of the remaining modalities are masked for a reconstruction task, for self-supervised learning of robust multimodal representations against missing modalities.

Brain Tumor Segmentation Representation Learning +3

Feature Completion Transformer for Occluded Person Re-identification

no code implementations 3 Mar 2023 Tao Wang, Mengyuan Liu, Hong Liu, Wenhao Li, Miaoju Ban, Tuanyu Guo, Yidi Li

In this paper, different from most previous works that discard the occluded region, we propose a Feature Completion Transformer (FCFormer) to implicitly complement the semantic information of occluded parts in the feature space.

Occluded Person Re-Identification Triplet

HTNet: Human Topology Aware Network for 3D Human Pose Estimation

1 code implementation 20 Feb 2023 Jialun Cai, Hong Liu, Runwei Ding, Wenhao Li, Jianbing Wu, Miaoju Ban

3D human pose estimation errors would propagate along the human body topology and accumulate at the end joints of limbs.

3D Human Pose Estimation

Learning Concordant Attention via Target-aware Alignment for Visible-Infrared Person Re-identification

no code implementations ICCV 2023 Jianbing Wu, Hong Liu, Yuxin Su, Wei Shi, Hao Tang

Owing to the large distribution gap between the heterogeneous data in Visible-Infrared Person Re-identification (VI Re-ID), we point out that existing paradigms often suffer from the inter-modal semantic misalignment issue and thus fail to align and compare local details properly.

Cross-Modal Retrieval Person Re-Identification +1

Uniform Sequence Better: Time Interval Aware Data Augmentation for Sequential Recommendation

1 code implementation 16 Dec 2022 Yizhou Dang, Enneng Yang, Guibing Guo, Linying Jiang, Xingwei Wang, Xiaoxiao Xu, Qinghui Sun, Hong Liu

However, we observe that the time intervals in a sequence may vary significantly, and thus result in the ineffectiveness of user modeling due to the issue of preference drift.

Data Augmentation Sequential Recommendation

An Unpaired Cross-modality Segmentation Framework Using Data Augmentation and Hybrid Convolutional Networks for Segmenting Vestibular Schwannoma and Cochlea

no code implementations 28 Nov 2022 Yuzhou Zhuang, Hong Liu, Enmin Song, Coskun Cetinkaya, Chih-Cheng Hung

We adopt two data augmentation methods for effectively learning the semantic information and generating realistic target domain scans: generative and online data augmentation.

Data Augmentation Segmentation

Self-distillation with Online Diffusion on Batch Manifolds Improves Deep Metric Learning

1 code implementation 14 Nov 2022 Zelong Zeng, Fan Yang, Hong Liu, Shin'ichi Satoh

However, this type of method normally ignores the crucial knowledge hidden in the data (e.g., intra-class information variation), which is harmful to the generalization of the trained model.

Metric Learning

Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models

no code implementations 25 Oct 2022 Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma

Toward understanding this implicit bias, we prove that SGD with standard mini-batch noise implicitly prefers flatter minima in language models, and empirically observe a strong correlation between flatness and downstream performance among models with the same minimal pre-training loss.

Language Modelling

A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems

1 code implementation 17 Oct 2022 Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, Junlan Feng

Second, an important ingredient in a US is that the user goal can be effectively incorporated and tracked; but how to flexibly integrate goal state tracking and develop an end-to-end trainable US for multi-domains has remained a challenge.

Reinforcement Learning (RL)

Jointly Reinforced User Simulator and Task-oriented Dialog System with Simplified Generative Architecture

no code implementations 13 Oct 2022 Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng

Recently, there has been progress in supervised fine-tuning of pretrained GPT-2 to build end-to-end task-oriented dialog (TOD) systems.

Information Extraction and Human-Robot Dialogue towards Real-life Tasks: A Baseline Study with the MobileCS Dataset

1 code implementation 27 Sep 2022 Hong Liu, Hao Peng, Zhijian Ou, Juanzi Li, Yi Huang, Junlan Feng

Recently, there has emerged a class of task-oriented dialogue (TOD) datasets collected through Wizard-of-Oz simulated games.

Identity-Sensitive Knowledge Propagation for Cloth-Changing Person Re-identification

1 code implementation 25 Aug 2022 Jianbing Wu, Hong Liu, Wei Shi, Hao Tang, Jingwen Guo

To mitigate the resolution degradation issue and mine identity-sensitive cues from human faces, we propose to restore the missing facial details using prior facial knowledge, which is then propagated to a smaller network.

Cloth-Changing Person Re-Identification Human Parsing

Advancing Semi-Supervised Task Oriented Dialog Systems by JSA Learning of Discrete Latent Variable Models

1 code implementation SIGDIAL (ACL) 2022 Yucheng Cai, Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng

In this paper, we propose to apply JSA to semi-supervised learning of the latent state TOD models, which is referred to as JSA-TOD.

Spatiotemporal Propagation Learning for Network-Wide Flight Delay Prediction

1 code implementation 14 Jul 2022 Yuankai Wu, Hongyu Yang, Yi Lin, Hong Liu

By this means, STPN allows cross-talk of spatial and temporal factors for modeling delay propagation.

Decision Making Time Series Analysis

Contrastive Learning from Spatio-Temporal Mixed Skeleton Sequences for Self-Supervised Skeleton-Based Action Recognition

1 code implementation 7 Jul 2022 Zhan Chen, Hong Liu, Tianyu Guo, Zhengyan Chen, Pinhao Song, Hao Tang

First, SkeleMix utilizes the topological information of skeleton data to mix two skeleton sequences by randomly combining the cropped skeleton fragments (the trimmed view) with the remaining skeleton sequences (the truncated view).

Action Recognition Contrastive Learning +3

A Challenge on Semi-Supervised and Reinforced Task-Oriented Dialog Systems

1 code implementation 6 Jul 2022 Zhijian Ou, Junlan Feng, Juanzi Li, Yakun Li, Hong Liu, Hao Peng, Yi Huang, Jiangjiang Zhao

A challenge on Semi-Supervised and Reinforced Task-Oriented Dialog Systems, co-located with the EMNLP 2022 SereTOD Workshop.

Multi-Scale Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition

1 code implementation 27 Jun 2022 Zhan Chen, Sicheng Li, Bing Yang, Qinghan Li, Hong Liu

To solve this problem, we present a multi-scale spatial graph convolution (MS-GC) module and a multi-scale temporal graph convolution (MT-GC) module to enrich the receptive field of the model in spatial and temporal dimensions.

Skeleton Based Action Recognition

Transformers Improve Breast Cancer Diagnosis from Unregistered Multi-View Mammograms

no code implementations 21 Jun 2022 Xuxin Chen, Ke Zhang, Neman Abdoli, Patrik W. Gilley, Ximin Wang, Hong Liu, Bin Zheng, Yuchen Qiu

For this purpose, we employ local Transformer blocks to separately learn patch relationships within four mammograms acquired from two views (CC/MLO) of both (right/left) breasts.

Image Registration

GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation

1 code implementation 13 Jun 2022 Wenhao Li, Mengyuan Liu, Hong Liu, Tianyu Guo, Ti Wang, Hao Tang, Nicu Sebe

To the best of our knowledge, this is the first MLP-Like architecture for 3D human pose estimation in a single frame and a video sequence.

3D Human Pose Estimation Representation Learning

AO2-DETR: Arbitrary-Oriented Object Detection Transformer

1 code implementation 25 May 2022 Linhui Dai, Hong Liu, Hao Tang, Zhiwei Wu, Pinhao Song

Comprehensive experiments on several challenging datasets show that our method achieves superior performance on the AOOD task.

Decoder Inductive Bias +5

Building Markovian Generative Architectures over Pretrained LM Backbones for Efficient Task-Oriented Dialog Systems

2 code implementations 13 Apr 2022 Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, Junlan Feng

Recently, Transformer based pretrained language models (PLMs), such as GPT2 and T5, have been leveraged to build generative task-oriented dialog (TOD) systems.

Simultaneous Alignment and Surface Regression Using Hybrid 2D-3D Networks for 3D Coherent Layer Segmentation of Retina OCT Images

1 code implementation 4 Mar 2022 Hong Liu, Dong Wei, Donghuan Lu, Yuexiang Li, Kai Ma, Liansheng Wang, Yefeng Zheng

To the best of our knowledge, this is the first study that attempts 3D retinal layer segmentation in volumetric OCT images based on CNNs.

Segmentation

Virtual Adversarial Training for Semi-supervised Breast Mass Classification

no code implementations 25 Jan 2022 Xuxin Chen, Ximin Wang, Ke Zhang, Kar-Ming Fung, Theresa C. Thai, Kathleen Moore, Robert S. Mannel, Hong Liu, Bin Zheng, Yuchen Qiu

This study aims to develop a novel computer-aided diagnosis (CAD) scheme for mammographic breast mass classification using semi-supervised learning.

Classification Medical Image Analysis

Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking

1 code implementation 14 Dec 2021 Yidi Li, Hong Liu, Hao Tang

Multi-modal fusion is proven to be an effective method to improve the accuracy and robustness of speaker tracking, especially in complex scenarios.

Self-Supervised Learning

Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition

1 code implementation 7 Dec 2021 Tianyu Guo, Hong Liu, Zhan Chen, Mengyuan Liu, Tao Wang, Runwei Ding

In this paper, to make better use of the movement patterns introduced by extreme augmentations, a Contrastive Learning framework utilizing Abundant Information Mining for self-supervised action Representation (AimCLR) is proposed.

Contrastive Learning Few-Shot Skeleton-Based Action Recognition +5

Pose-guided Feature Disentangling for Occluded Person Re-identification Based on Transformer

1 code implementation 5 Dec 2021 Tao Wang, Hong Liu, Pinhao Song, Tianyu Guo, Wei Shi

Therefore, we propose a transformer-based Pose-guided Feature Disentangling (PFD) method by utilizing pose information to clearly disentangle semantic components (e.g., human body or joint parts) and selectively match non-occluded parts correspondingly.

Decoder Occluded Person Re-Identification

Improving Camouflaged Object Detection with the Uncertainty of Pseudo-edge Labels

1 code implementation 29 Oct 2021 Nobukatsu Kajiura, Hong Liu, Shin'ichi Satoh

This framework consists of three key components, i.e., a pseudo-edge generator, a pseudo-map generator, and an uncertainty-aware refinement module.

Object Detection

Self-supervised Learning is More Robust to Dataset Imbalance

1 code implementation ICLR 2022 Hong Liu, Jeff Z. HaoChen, Adrien Gaidon, Tengyu Ma

Third, inspired by the theoretical insights, we devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets with several evaluation criteria, closing the small gap between balanced and imbalanced datasets with the same number of examples.

Long-tail Learning Self-Supervised Learning

Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection

1 code implementation 5 Oct 2021 Zhirong Ye, Xiangdong Wang, Hong Liu, Yueliang Qian, Rui Tao, Long Yan, Kazushige Ouchi

A critical issue with the frame-based model is that it pursues the best frame-level prediction rather than the best event-level prediction.

Audio Tagging Boundary Detection +5

Learning Universal User Representations via Self-Supervised Lifelong Behaviors Modeling

no code implementations 29 Sep 2021 Bei Yang, Ke Liu, Xiaoxiao Xu, Renjun Xu, Hong Liu, Huan Xu

However, existing research has little ability to model universal user representations based on lifelong behavior sequences dating back to user registration.

Contrastive Learning Dimensionality Reduction +2

Interest-oriented Universal User Representation via Contrastive Learning

no code implementations 18 Sep 2021 Qinghui Sun, Jie Gu, Bei Yang, Xiaoxiao Xu, Renjun Xu, Shangde Gao, Hong Liu, Huan Xu

Universal user representation has received much interest recently, since it frees us from the cumbersome work of training a specific model for each downstream application.

Contrastive Learning Representation Learning +1

Variational Latent-State GPT for Semi-Supervised Task-Oriented Dialog Systems

2 code implementations 9 Sep 2021 Hong Liu, Yucheng Cai, Zhenru Lin, Zhijian Ou, Yi Huang, Junlan Feng

In this paper, we propose the Variational Latent-State GPT model (VLS-GPT), which is the first to combine the strengths of the two approaches.

Transformer for Single Image Super-Resolution

1 code implementation 25 Aug 2021 Zhisheng Lu, Juncheng Li, Hong Liu, Chaoyan Huang, Linlin Zhang, Tieyong Zeng

LTB is composed of a series of Efficient Transformers (ET), which occupy little GPU memory thanks to the specially designed Efficient Multi-Head Attention (EMHA).

Image Super-Resolution

Towards Robustness Against Natural Language Word Substitutions

1 code implementation ICLR 2021 Xinshuai Dong, Anh Tuan Luu, Rongrong Ji, Hong Liu

Robustness against word substitutions has a well-defined and widely acceptable form, i.e., using semantically similar words as substitutions, and thus it is considered a fundamental stepping-stone towards broader robustness in natural language processing.

Natural Language Inference Sentiment Analysis

Recent advances and clinical applications of deep learning in medical image analysis

no code implementations 27 May 2021 Xuxin Chen, Ximin Wang, Ke Zhang, Kar-Ming Fung, Theresa C. Thai, Kathleen Moore, Robert S. Mannel, Hong Liu, Bin Zheng, Yuchen Qiu

Deep learning has received extensive research interest in developing new medical image processing algorithms, and deep learning based models have been remarkably successful in a variety of medical imaging tasks to support disease detection and diagnosis.

Deep Learning Image Registration +2

Stability from graph symmetrisation arguments with applications to inducibility

no code implementations 19 Dec 2020 Hong Liu, Oleg Pikhurko, Maryam Sharifzadeh, Katherine Staden

We present a sufficient condition for the stability property of extremal graph problems that can be solved via Zykov's symmetrisation.

Combinatorics

Part-based Lipreading for Audio-Visual Speech Recognition

no code implementations IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2020 Ziling Miao, Hong Liu, Bing Yang

In this paper, a part-based lipreading (PBL) method is proposed to deal with the mismatch between an overall lip model and the separate parts of the lips, as well as the excessive dependence of models on the speakers in the training set.

Audio-Visual Speech Recognition Lipreading +2

Multi-Scale Cascading Network with Compact Feature Learning for RGB-Infrared Person Re-Identification

no code implementations 12 Dec 2020 Can Zhang, Hong Liu, Wei Guo, Mang Ye

RGB-Infrared person re-identification (RGB-IR Re-ID) aims to match persons from heterogeneous images captured by visible and thermal cameras, which is of great significance for surveillance systems under poor lighting conditions.

Person Re-Identification

Extremal density for sparse minors and subdivisions

no code implementations 3 Dec 2020 John Haslegrave, JaeHoon Kim, Hong Liu

We prove an asymptotically tight bound on the extremal density guaranteeing subdivisions of bounded-degree bipartite graphs with a mild separability condition.

Combinatorics 05C83, 05C35

Learning to Adapt to Evolving Domains

1 code implementation NeurIPS 2020 Hong Liu, Mingsheng Long, Jianmin Wang, Yu Wang

(2) Since the target data arrive online, the agent should also maintain competence on previous target domains, i.e., to adapt without forgetting.

Meta-Learning Transfer Learning +1

Meta-learning Transferable Representations with a Single Target Domain

no code implementations 3 Nov 2020 Hong Liu, Jeff Z. HaoChen, Colin Wei, Tengyu Ma

Recent works found that fine-tuning and joint training, two popular approaches for transfer learning, do not always improve accuracy on downstream tasks.

Meta-Learning Representation Learning +1

Lip Graph Assisted Audio-Visual Speech Recognition Using Bidirectional Synchronous Fusion

no code implementations Interspeech 2020 Hong Liu, Zhan Chen, Bing Yang

Second, the hybrid visual stream is combined with the audio stream by an attention-based bidirectional synchronous fusion which allows bidirectional information interaction to resolve the asynchrony between the two modalities during fusion.

Audio-Visual Speech Recognition Landmark-based Lipreading +2

Anti-Bandit Neural Architecture Search for Model Defense

no code implementations ECCV 2020 Hanlin Chen, Baochang Zhang, Song Xue, Xuan Gong, Hong Liu, Rongrong Ji, David Doermann

Deep convolutional neural networks (DCNNs) have dominated as the best performers in machine learning, but can be challenged by adversarial attacks.

Denoising Neural Architecture Search

Two-stage growth mode for lift-off mechanism in oblique shock-wave/jet interaction

no code implementations 11 Jul 2020 Bin Yu, Miaosheng He, Bin Zhang, Hong Liu

Based on the objective coordinate system in the frame of the oblique shock structure, it is found that the nature of the three-dimensional lift-off structure of a shock-induced streamwise vortex is inherently and precisely controlled by a two-stage growth mode of structure kinetics of a shock bubble interaction (SBI for short).

Fluid Dynamics

Bi-directional Exponential Angular Triplet Loss for RGB-Infrared Person Re-Identification

1 code implementation 1 Jun 2020 Hanrong Ye, Hong Liu, Fanyang Meng, Xia Li

As an angularly discriminative feature space is important for classifying the human images based on their embedding vectors, in this paper, we propose a novel ranking loss function, named Bi-directional Exponential Angular Triplet Loss, to help learn an angularly separable common feature space by explicitly constraining the included angles between embedding vectors.

Person Re-Identification Triplet

Projection & Probability-Driven Black-Box Attack

1 code implementation CVPR 2020 Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian

For reducing the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.

WQT and DG-YOLO: towards domain generalization in underwater object detection

no code implementations 14 Apr 2020 Hong Liu, Pinhao Song, Runwei Ding

This paper aims to build a GUOD with a small underwater dataset covering limited types of water quality.

Data Augmentation Diversity +4

Online Initialization and Extrinsic Spatial-Temporal Calibration for Monocular Visual-Inertial Odometry

no code implementations 12 Apr 2020 Weibo Huang, Hong Liu, Weiwei Wan

To compensate for the impact of time offset, our method includes two short-term motion interpolation algorithms for the camera and IMU pose estimation.

Motion Interpolation Pose Estimation

Spatial Pyramid Based Graph Reasoning for Semantic Segmentation

no code implementations CVPR 2020 Xia Li, Yibo Yang, Qijie Zhao, Tiancheng Shen, Zhouchen Lin, Hong Liu

The convolution operation suffers from a limited receptive field, while global modeling is fundamental to dense prediction tasks, such as semantic segmentation.

Segmentation Semantic Segmentation

A Survey on 3D Skeleton-Based Action Recognition Using Learning Method

no code implementations 14 Feb 2020 Bin Ren, Mengyuan Liu, Runwei Ding, Hong Liu

To the best of our knowledge, this research represents the first comprehensive discussion of deep learning-based action recognition using 3D skeleton data.

Action Recognition Deep Learning +1

Unified Generative Adversarial Networks for Controllable Image-to-Image Translation

1 code implementation 12 Dec 2019 Hao Tang, Hong Liu, Nicu Sebe

The proposed model consists of a single generator and a discriminator taking a conditional image and the target controllable structure as input.

Facial Expression Translation Generative Adversarial Network +3

AttentionGAN: Unpaired Image-to-Image Translation using Attention-Guided Generative Adversarial Networks

2 code implementations 27 Nov 2019 Hao Tang, Hong Liu, Dan Xu, Philip H. S. Torr, Nicu Sebe

State-of-the-art methods in image-to-image translation are capable of learning a mapping from a source domain to a target domain with unpaired image data.

Image-to-Image Translation Translation

An End-to-end Approach for Lexical Stress Detection based on Transformer

no code implementations 6 Nov 2019 Yong Ruan, Xiangdong Wang, Hong Liu, Zhigang Ou, Yun Gao, Jianfeng Cheng, Yueliang Qian

For this, we train a transformer model using audio feature sequences and their phoneme sequences with lexical stress marks.

General Classification

Universal Adversarial Perturbation via Prior Driven Uncertainty Approximation

no code implementations ICCV 2019 Hong Liu, Rongrong Ji, Jie Li, Baochang Zhang, Yue Gao, Yongjian Wu, Feiyue Huang

Deep learning models have shown their vulnerabilities to universal adversarial perturbations (UAP), which are quasi-imperceptible.

Towards Understanding the Transferability of Deep Representations

no code implementations 26 Sep 2019 Hong Liu, Mingsheng Long, Jian-Min Wang, Michael I. Jordan

3) The feasibility of transferability is related to the similarity of both input and label.

Guided Learning Convolution System for DCASE 2019 Task 4

1 code implementation 11 Sep 2019 Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian

In this paper, we describe in detail the system we submitted to DCASE2019 task 4: sound event detection (SED) in domestic environments.

Event Detection Sound Event Detection

Identifying Illicit Accounts in Large Scale E-payment Networks -- A Graph Representation Learning Approach

no code implementations 13 Jun 2019 Da Sun Handason Tam, Wing Cheong Lau, Bin Hu, Qiu Fang Ying, Dah Ming Chiu, Hong Liu

In the context of e-payment transaction graphs, the resultant node and edge embeddings can effectively characterize the user-background as well as the financial transaction patterns of individual account holders.

Graph Embedding Graph Mining +2

Guided learning for weakly-labeled semi-supervised sound event detection

1 code implementation 6 Jun 2019 Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian

Instead of designing a single model by considering a trade-off between the two sub-targets, we design a teacher model aiming at audio tagging to guide a student model aiming at boundary detection to learn using the unlabeled data.

Audio Tagging Boundary Detection +3

Separate to Adapt: Open Set Domain Adaptation via Progressive Separation

no code implementations CVPR 2019 Hong Liu, Zhangjie Cao, Mingsheng Long, Jianmin Wang, Qiang Yang

While several methods have been proposed to address OSDA, none of them takes into account the openness of the target domain, which is measured by the proportion of unknown classes in all target classes.

Domain Adaptation

Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection

1 code implementation 24 May 2019 Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian

In this paper, a special decision surface for the weakly-supervised sound event detection (SED) and a disentangled feature (DF) for the multi-label problem in polyphonic SED are proposed.

Event Detection Multi-Label Classification +2

A novel algorithm for segmentation of leukocytes in peripheral blood

no code implementations 21 May 2019 Haichao Cao, Hong Liu, Enmin Song

First, the nucleus of the leukocyte was separated by using the stepwise averaging method.

Dual-branch residual network for lung nodule segmentation

no code implementations 21 May 2019 Haichao Cao, Hong Liu, Enmin Song, Chih-Cheng Hung, Guangzhi Ma, Xiangyang Xu, Renchao Jin, Jianguo Lu

Experimental results show that the DB-ResNet achieves superior segmentation performance with an average Dice score of 82.74% on the dataset.

Computed Tomography (CT) Lung Nodule Segmentation +1

Hadamard Matrix Guided Online Hashing

1 code implementation 11 May 2019 Mingbao Lin, Rongrong Ji, Hong Liu, Xiaoshuai Sun, Shen Chen, Qi Tian

We then treat the learning of hash functions as a set of binary classification problems to fit the assigned target code.

Binary Classification

Two-Stage Convolutional Neural Network Architecture for Lung Nodule Detection

no code implementations 9 May 2019 Haichao Cao, Hong Liu, Enmin Song, Guangzhi Ma, Xiangyang Xu, Renchao Jin, Tengying Liu, Chih-Cheng Hung

The CNN architecture in the first stage is based on the improved UNet segmentation network to establish an initial detection of lung nodules.

Computed Tomography (CT) Data Augmentation +4

Supervised Online Hashing via Hadamard Codebook Learning

1 code implementation 28 Apr 2019 Mingbao Lin, Rongrong Ji, Hong Liu, Yongjian Liu

Notably, the proposed HCOH can be embedded with supervised labels and is not limited to a predefined category number.

Retrieval Semantic Similarity +1

Multi-view Vector-valued Manifold Regularization for Multi-label Image Classification

no code implementations 8 Apr 2019 Yong Luo, DaCheng Tao, Chang Xu, Chao Xu, Hong Liu, Yonggang Wen

In computer vision, image datasets used for classification are naturally associated with multiple labels and comprised of multiple views, because each image may contain several objects (e.g., pedestrian, bicycle and tree) and is properly characterized by multiple visual features (e.g., color, texture and shape).

General Classification Multi-Label Image Classification

Towards Optimal Discrete Online Hashing with Balanced Similarity

1 code implementation 29 Jan 2019 Mingbao Lin, Rongrong Ji, Hong Liu, Xiaoshuai Sun, Yongjian Wu, Yunsheng Wu

In this paper, we propose a novel supervised online hashing method, termed Balanced Similarity for Online Discrete Hashing (BSODH), to solve the above problems in a unified framework.

Retrieval

Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion

1 code implementation 15 Jan 2019 Hao Tang, Hong Liu, Wei Xiao, Nicu Sebe

Gesture recognition is a hot topic in computer vision and pattern recognition, which plays a vitally important role in natural human-computer interface.

Clustering Hand Gesture Recognition +1