Search Results for author: Xiao Liu

Found 173 papers, 74 papers with code

Dual-Channel Evidence Fusion for Fact Verification over Texts and Tables

no code implementations NAACL 2022 Nan Hu, Zirui Wu, Yuxuan Lai, Xiao Liu, Yansong Feng

Different from previous fact extraction and verification tasks that only consider evidence of a single format, FEVEROUS brings further challenges by extending the evidence format to both plain text and tables.

Fact Verification

SeqDialN: Sequential Visual Dialog Network in Joint Visual-Linguistic Representation Space

1 code implementation ACL (dialdoc) 2021 Liu Yang, Fanqi Meng, Xiao Liu, Ming-Kuang Daniel Wu, Vicent Ying, James Xu

In this work, we formulate a visual dialog as an information flow in which each piece of information is encoded with the joint visual-linguistic representation of a single dialog round.

Visual Dialog

Geo-BERT Pre-training Model for Query Rewriting in POI Search

no code implementations Findings (EMNLP) 2021 Xiao Liu, Juan Hu, Qi Shen, Huan Chen

Finally, we train a BERT-like pre-training model with text and POIs’ graph embeddings to get an integrated representation of both geographic and semantic information, and apply it in the QR of POI search.

Graph Representation Learning

An Attention-driven Two-stage Clustering Method for Unsupervised Person Re-Identification

no code implementations ECCV 2020 Zilong Ji, Xiaolong Zou, Xiaohan Lin, Xiao Liu, Tiejun Huang, Si Wu

By iteratively learning with the two strategies, the attentive regions are gradually shifted from the background to the foreground and the features become more discriminative.

Unsupervised Person Re-Identification

P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks

no code implementations ACL 2022 Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling

MASTER: Multi-task Pre-trained Bottlenecked Masked Autoencoders are Better Dense Retrievers

1 code implementation 15 Dec 2022 Kun Zhou, Xiao Liu, Yeyun Gong, Wayne Xin Zhao, Daxin Jiang, Nan Duan, Ji-Rong Wen

Dense retrieval aims to map queries and passages into low-dimensional vector space for efficient similarity measuring, showing promising effectiveness in various large-scale retrieval tasks.

Passage Retrieval Retrieval

An adaptive human-in-the-loop approach to emission detection of Additive Manufacturing processes and active learning with computer vision

no code implementations 12 Dec 2022 Xiao Liu, Alan F. Smeaton, Alessandra Mileo

More specifically, this paper will look at two scenarios: firstly, using convolutional neural networks (CNNs) to automatically inspect and classify emission data collected by in-situ monitoring and secondly, applying Active Learning techniques to the developed classification model to construct a human-in-the-loop mechanism in order to accelerate the labeling process of the emission data.

Active Learning Transfer Learning

LEAD: Liberal Feature-based Distillation for Dense Retrieval

no code implementations 10 Dec 2022 Hao Sun, Xiao Liu, Yeyun Gong, Anlei Dong, Jian Jiao, Jingwen Lu, Yan Zhang, Daxin Jiang, Linjun Yang, Rangan Majumder, Nan Duan

Knowledge distillation is often used to transfer knowledge from a strong teacher model to a relatively weak student model.

Knowledge Distillation Retrieval

ClueWeb22: 10 Billion Web Documents with Visual and Semantic Information

no code implementations 29 Nov 2022 Arnold Overwijk, Chenyan Xiong, Xiao Liu, Cameron VandenBerg, Jamie Callan

ClueWeb22, the newest iteration of the ClueWeb line of datasets, provides 10 billion web pages affiliated with rich information.


Imperceptible Adversarial Attack via Invertible Neural Networks

1 code implementation 28 Nov 2022 Zihan Chen, Ziyue Wang, JunJie Huang, Wentao Zhao, Xiao Liu, Dejian Guan

Adding perturbations via utilizing auxiliary gradient information or discarding existing details of the benign images are two common approaches for generating adversarial examples.

Adversarial Attack

Reason from Context with Self-supervised Learning

no code implementations 23 Nov 2022 Xiao Liu, Ankur Sikarwar, Joo Hwee Lim, Gabriel Kreiman, Zenglin Shi, Mengmi Zhang

Context reasoning is critical in visual recognition, where current inputs need to be interpreted in the light of previous experience and knowledge.

Self-Supervised Learning

Does Debiasing Inevitably Degrade the Model Performance

no code implementations 14 Nov 2022 Yiran Liu, Xiao Liu, Haotian Chen, Yang Yu

We use our theoretical framework to explain why the current debiasing methods cause performance degradation.

IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models

no code implementations 25 Oct 2022 Chenguang Wang, Xiao Liu, Dawn Song

Instead of focusing on pre-defined relations, we create an OIE benchmark aiming to fully examine the open relational information present in the pre-trained LMs.

Open Information Extraction

SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval

1 code implementation 21 Oct 2022 Kun Zhou, Yeyun Gong, Xiao Liu, Wayne Xin Zhao, Yelong Shen, Anlei Dong, Jingwen Lu, Rangan Majumder, Ji-Rong Wen, Nan Duan, Weizhu Chen

Thus, we propose a simple ambiguous negatives sampling method, SimANS, which incorporates a new sampling probability distribution to sample more ambiguous negatives.

Retrieval Text Retrieval

Counterfactual Recipe Generation: Exploring Compositional Generalization in a Realistic Scenario

1 code implementation 20 Oct 2022 Xiao Liu, Yansong Feng, Jizhi Tang, Chengang Hu, Dongyan Zhao

Although pretrained language models can generate fluent recipe texts, they fail to truly learn and use the culinary knowledge in a compositional way.

Pretrained Language Models Recipe Generation

Diffusion Models for Causal Discovery via Topological Ordering

1 code implementation 12 Oct 2022 Pedro Sanchez, Xiao Liu, Alison Q. O'Neil, Sotirios A. Tsaftaris

Topological ordering approaches for causal discovery exploit this by performing graph discovery in two steps, first sequentially identifying nodes in reverse order of depth (topological ordering), and secondly pruning the potential relations.

Causal Discovery

Learning Credit Assignment for Cooperative Reinforcement Learning

no code implementations 10 Oct 2022 Wubing Chen, Wenbin Li, Xiao Liu, Shangdong Yang

Empirically, we evaluate MAPPG on the well-known matrix game and differential game, and verify that MAPPG can converge to the global optimum for both discrete and continuous action spaces.

reinforcement-learning reinforcement Learning +2

HEGEL: Hypergraph Transformer for Long Document Summarization

no code implementations 9 Oct 2022 Haopeng Zhang, Xiao Liu, Jiawei Zhang

Extractive summarization for long documents is challenging due to the extended structured input context.

Document Summarization Extractive Summarization

Uplifting Message Passing Neural Network with Graph Original Information

no code implementations 8 Oct 2022 Xiao Liu, Lijun Zhang, Hui Guan

Message passing neural networks (MPNNs) learn the representation of graph-structured data based on graph original information, including node features and graph structures, and have shown astonishing improvement in node classification tasks.

Graph Representation Learning Node Classification

PROD: Progressive Distillation for Dense Retrieval

no code implementations 27 Sep 2022 Zhenghao Lin, Yeyun Gong, Xiao Liu, Hang Zhang, Chen Lin, Anlei Dong, Jian Jiao, Jingwen Lu, Daxin Jiang, Rangan Majumder, Nan Duan

It is common that a better teacher model results in a bad student via distillation due to the nonnegligible gap between teacher and student.

Knowledge Distillation Natural Questions +1

Diverse Title Generation for Stack Overflow Posts with Multiple Sampling Enhanced Transformer

1 code implementation 24 Aug 2022 Fengji Zhang, Jin Liu, Yao Wan, Xiao Yu, Xiao Liu, Jacky Keung

Stack Overflow is one of the most popular programming communities where developers can seek help for their encountered problems.

Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries

1 code implementation 16 Aug 2022 Xiao Liu, Shiyu Zhao, Kai Su, Yukuo Cen, Jiezhong Qiu, Mengdi Zhang, Wei Wu, Yuxiao Dong, Jie Tang

In this work, we present the Knowledge Graph Transformer (kgTransformer) with masked pre-training and fine-tuning strategies.

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition

1 code implementation 23 Jul 2022 Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai

Recently, most handwritten mathematical expression recognition (HMER) methods adopt the encoder-decoder networks, which directly predict the markup sequences from formula images with the attention mechanism.

Optical Character Recognition

BCRLSP: An Offline Reinforcement Learning Framework for Sequential Targeted Promotion

no code implementations 16 Jul 2022 Fanglin Chen, Xiao Liu, Bo Tang, Feiyu Xiong, Serim Hwang, Guomian Zhuang

During deployment, we combine the offline RL model with the LP model to generate a robust policy under the budget constraints.

Offline RL reinforcement-learning +1

Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers

2 code implementations 14 Jul 2022 Weng Lam Tam, Xiao Liu, Kaixuan Ji, Lilong Xue, Xingjian Zhang, Yuxiao Dong, Jiahua Liu, Maodi Hu, Jie Tang

By updating only 0.1% of the model parameters, the prompt tuning strategy can help retrieval models achieve better generalization performance than traditional methods in which all parameters are updated.

Retrieval Text Retrieval

Why patient data cannot be easily forgotten?

no code implementations 29 Jun 2022 Ruolin Su, Xiao Liu, Sotirios A. Tsaftaris

With the advent of AI learned on data, one can imagine that such rights can extend to requests for forgetting knowledge of patient's data within AI models.

vMFNet: Compositionality Meets Domain-generalised Segmentation

1 code implementation 29 Jun 2022 Xiao Liu, Spyridon Thermos, Pedro Sanchez, Alison Q. O'Neil, Sotirios A. Tsaftaris

Moreover, with a reconstruction module, unlabeled data can also be used to learn the vMF kernels and likelihoods by recombining them to reconstruct the input image.

Anatomy Image Segmentation +2

Physics-Informed Statistical Modeling for Wildfire Aerosols Process Using Multi-Source Geostationary Satellite Remote-Sensing Data Streams

1 code implementation 23 Jun 2022 Guanzhou Wei, Venkat Krishnan, Yu Xie, Manajit Sengupta, Yingchen Zhang, Haitao Liao, Xiao Liu

Increasingly frequent wildfires significantly affect solar energy production as the atmospheric aerosols generated by wildfires diminish the incoming solar radiation to the earth.

Regression Trees on Grassmann Manifold for Adapting Reduced-Order Models

no code implementations 22 Jun 2022 Xiao Liu, Xinchao Liu

When a ROM, constructed using the POD basis obtained from training data, is applied to new parameter settings, the model often lacks robustness against the change of parameters in design, control, and other real-time operation problems.


Improving Subgraph Representation Learning via Multi-View Augmentation

no code implementations 25 May 2022 Yili Shen, Xiao Liu, Cheng-Wei Ju, Jiaxu Yan, Jun Yi, Zhou Lin, Hui Guan

Subgraph representation learning based on Graph Neural Network (GNN) has exhibited broad applications in scientific advancements, such as predictions of molecular structure-property relationships and collective cellular function.

Representation Learning

GraphMAE: Self-Supervised Masked Graph Autoencoders

1 code implementation 22 May 2022 Zhenyu Hou, Xiao Liu, Yukuo Cen, Yuxiao Dong, Hongxia Yang, Chunjie Wang, Jie Tang

Despite this, contrastive learning, which heavily relies on structural data augmentation and complicated training strategies, has been the dominant approach in graph SSL, while the progress of generative SSL on graphs, especially graph autoencoders (GAEs), has thus far not reached the potential as promised in other fields.

Contrastive Learning Graph Classification +4

MGRR-Net: Multi-level Graph Relational Reasoning Network for Facial Action Units Detection

no code implementations 4 Apr 2022 Xuri Ge, Joemon M. Jose, Songpei Xu, Xiao Liu, Hu Han

While the region-level feature learning from local face patches features via graph neural network can encode the correlation across different AUs, the pixel-wise and channel-wise feature learning via graph attention network can enhance the discrimination ability of AU features from global face features.

Graph Attention Relational Reasoning

Things not Written in Text: Exploring Spatial Commonsense from Visual Signals

1 code implementation ACL 2022 Xiao Liu, Da Yin, Yansong Feng, Dongyan Zhao

We probe PLMs and models with visual signals, including vision-language pretrained models and image synthesis models, on this benchmark, and find that image synthesis models are more capable of learning accurate and consistent spatial knowledge than other models.

Image Generation Natural Language Understanding +1

A Tree-Structured Multi-Task Model Recommender

1 code implementation 10 Mar 2022 Lijun Zhang, Xiao Liu, Hui Guan

Tree-structured multi-task architectures have been employed to jointly tackle multiple vision tasks in the context of multi-task learning (MTL).

Multi-Task Learning

Syntax-Aware Network for Handwritten Mathematical Expression Recognition

2 code implementations CVPR 2022 Ye Yuan, Xiao Liu, Wondimu Dikubab, Hui Liu, Zhilong Ji, Zhongqin Wu, Xiang Bai

In this paper, we propose a simple and efficient method for HMER, which is the first to incorporate syntax information into an encoder-decoder network.

Automatic Facial Paralysis Estimation with Facial Action Units

no code implementations 3 Mar 2022 Xuri Ge, Joemon M. Jose, Pengcheng Wang, Arunachalam Iyer, Xiao Liu, Hu Han

In this paper, we propose a novel Adaptive Local-Global Relational Network (ALGRNet) for facial AU detection and use it to classify facial paralysis severity.

Keeping Minimal Experience to Achieve Efficient Interpretable Policy Distillation

no code implementations 2 Mar 2022 Xiao Liu, Shuyang Liu, Wenbin Li, Shangdong Yang, Yang Gao

Although deep reinforcement learning has become a universal solution for complex control tasks, its real-world applicability is still limited because it lacks security guarantees for policies.

SelfKG: Self-Supervised Entity Alignment in Knowledge Graphs

1 code implementation 2 Mar 2022 Xiao Liu, Haoyun Hong, Xinghao Wang, Zeyi Chen, Evgeny Kharlamov, Yuxiao Dong, Jie Tang

We present SelfKG with efficient strategies to optimize this objective for aligning entities without label supervision.

Entity Alignment Knowledge Graphs +1

3D Intracranial Aneurysm Classification and Segmentation via Unsupervised Dual-branch Learning

no code implementations 6 Jan 2022 Di Shao, Xuequan Lu, Xiao Liu

While most existing deep learning research focused on medical images in a supervised way, we introduce an unsupervised method for the detection of intracranial aneurysms based on 3D point cloud data.

Unsupervised Pre-training

All-in-One Image Restoration for Unknown Corruption

1 code implementation CVPR 2022 Boyun Li, Xiao Liu, Peng Hu, Zhongqin Wu, Jiancheng Lv, Xi Peng

In this paper, we study a challenging problem in image restoration, namely, how to develop an all-in-one method that could recover images from a variety of unknown corruption types and levels.

Image Restoration

MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment

1 code implementation 2 Dec 2021 Jie Ren, Wenteng Liang, Ran Yan, Luo Mai, Shiwen Liu, Xiao Liu

Large-scale Bundle Adjustment (BA) requires massive memory and computation resources, which are difficult for existing BA libraries to provide.

Learning with Noisy Correspondence for Cross-modal Matching

1 code implementation NeurIPS 2021 Zhenyu Huang, Guocheng Niu, Xiao Liu, Wenbiao Ding, Xinyan Xiao, Hua Wu, Xi Peng

Based on this observation, we reveal and study a latent and challenging direction in cross-modal matching, named noisy correspondence, which could be regarded as a new paradigm of noisy labels.

Cross-Modal Retrieval Memorization +2

TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers

1 code implementation CVPR 2022 Yikang Ding, Wentao Yuan, Qingtian Zhu, Haotian Zhang, Xiangyue Liu, Yuanjiang Wang, Xiao Liu

We analogize MVS back to its nature of a feature matching task and therefore propose a powerful Feature Matching Transformer (FMT) to leverage intra- (self-) and inter- (cross-) attention to aggregate long-range context information within and across images.

3D Reconstruction

MegLoc: A Robust and Accurate Visual Localization Pipeline

no code implementations 25 Nov 2021 Shuxue Peng, Zihang He, Haotian Zhang, Ran Yan, Chuting Wang, Qingtian Zhu, Xiao Liu

In this paper, we present a visual localization pipeline, namely MegLoc, for robust and accurate 6-DoF pose estimation under varying scenarios, including indoor and outdoor scenes, different time across a day, different seasons across a year, and even across years.

Autonomous Driving Pose Estimation +1

Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression

no code implementations 19 Nov 2021 Yuezhou Sun, Wenlong Zhao, Lijun Zhang, Xiao Liu, Hui Guan, Matei Zaharia

This paper investigates deep neural network (DNN) compression from the perspective of compactly representing and storing trained parameters.

Neural Network Compression Quantization

Meta-learning for RIS-assisted NOMA Networks

no code implementations 4 Nov 2021 Yixuan Zou, Yuanwei Liu, Kaifeng Han, Xiao Liu, Kok Keong Chai

Extensive simulation results demonstrate that the proposed QoS-based NOMA network achieves significantly higher transmission throughput compared to the conventional orthogonal multiple access (OMA) network.


AutoMTL: A Programming Framework for Automating Efficient Multi-Task Learning

1 code implementation 25 Oct 2021 Lijun Zhang, Xiao Liu, Hui Guan

The first challenge is to determine what parameters to share across tasks to optimize for both memory efficiency and task accuracy.

Multi-Task Learning

Deep Point Cloud Normal Estimation via Triplet Learning

no code implementations 20 Oct 2021 Weijia Wang, Xuequan Lu, Dasith de Silva Edirimuni, Xiao Liu, Antonio Robles-Kelly

It consists of two phases: (a) feature encoding which learns representations of local patches, and (b) normal estimation that takes the learned representation as input and regresses the normal vector.

P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

2 code implementations 14 Oct 2021 Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling

Learning the Representation of Behavior Styles with Imitation Learning

no code implementations 29 Sep 2021 Xiao Liu, Meng Wang, Zhaorong Wang, Yingfeng Chen, Yujing Hu, Changjie Fan, Chongjie Zhang

Imitation learning is one of the methods for reproducing expert demonstrations adaptively by learning a mapping between observations and actions.

Imitation Learning

LOF: Structure-Aware Line Tracking based on Optical Flow

no code implementations 17 Sep 2021 Meixiang Quan, Zheng Chai, Xiao Liu

Lines provide significantly richer geometric structural information about the environment than points, so lines are widely used in recent Visual Odometry (VO) works.

Line Detection Optical Flow Estimation +1

Learning Disentangled Representations in the Imaging Domain

1 code implementation 26 Aug 2021 Xiao Liu, Pedro Sanchez, Spyridon Thermos, Alison Q. O'Neil, Sotirios A. Tsaftaris

Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision.

Representation Learning

Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation

1 code implementation ICCV 2021 Yuxiang Wei, Yupeng Shi, Xiao Liu, Zhilong Ji, Yuan Gao, Zhongqin Wu, WangMeng Zuo

It simply encourages the variation of output caused by perturbations on different latent dimensions to be orthogonal, and the Jacobian with respect to the input is calculated to represent this variation.

Disentanglement Image Generation

Method Towards CVPR 2021 Image Matching Challenge

no code implementations 10 Aug 2021 Xiaopeng Bi, Yu Chen, Xinyang Liu, Dehao Zhang, Ran Yan, Zheng Chai, Haotian Zhang, Xiao Liu

This report describes Megvii-3D team's approach towards CVPR 2021 Image Matching Workshop.

Method Towards CVPR 2021 SimLocMatch Challenge

no code implementations 10 Aug 2021 Xiaopeng Bi, Ran Yan, Zheng Chai, Haotian Zhang, Xiao Liu

This report describes Megvii-3D team's approach towards SimLocMatch Challenge @ CVPR 2021 Image Matching Workshop.

SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation

no code implementations 10 Aug 2021 Xin Wang, Yasheng Wang, Fei Mi, Pingyi Zhou, Yao Wan, Xiao Liu, Li Li, Hao Wu, Jin Liu, Xin Jiang

Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence.

Clone Detection Code Search +5

Structured Multi-modal Feature Embedding and Alignment for Image-Sentence Retrieval

no code implementations 5 Aug 2021 Xuri Ge, Fuhai Chen, Joemon M. Jose, Zhilong Ji, Zhongqin Wu, Xiao Liu

In this work, we propose to address the above issue from two aspects: (i) constructing intrinsic structure (along with relations) among the fragments of respective modalities, e.g., "dog $\to$ play $\to$ ball" in semantic structure for an image, and (ii) seeking explicit inter-modal structural and semantic correspondence between the visual and textual modalities.

Retrieval Semantic correspondence

UniCon: Unified Context Network for Robust Active Speaker Detection

no code implementations 5 Aug 2021 Yuanhang Zhang, Susan Liang, Shuang Yang, Xiao Liu, Zhongqin Wu, Shiguang Shan, Xilin Chen

Our solution is a novel, unified framework that focuses on jointly modeling multiple types of contextual information: spatial context to indicate the position and scale of each candidate's face, relational context to capture the visual relationships among the candidates and contrast audio-visual affinities with each other, and temporal context to aggregate long-term information and smooth out local uncertainties.

Audio-Visual Active Speaker Detection

Rethinking Hard-Parameter Sharing in Multi-Domain Learning

no code implementations 23 Jul 2021 Lijun Zhang, Qizheng Yang, Xiao Liu, Hui Guan

One common sharing practice is to share the bottom layers of a deep neural network among domains while using separate top layers for each domain.

Fine-Grained Image Classification Multi-Task Learning

Locality-aware Channel-wise Dropout for Occluded Face Recognition

no code implementations 20 Jul 2021 Mingjie He, Jie Zhang, Shiguang Shan, Xiao Liu, Zhongqin Wu, Xilin Chen

Furthermore, by randomly dropping out several feature channels, our method can well simulate the occlusion of a larger area.

Face Recognition

Unsupervised Neural Rendering for Image Hazing

no code implementations 14 Jul 2021 Boyun Li, Yijie Lin, Xiao Liu, Peng Hu, Jiancheng Lv, Xi Peng

To generate plausible haze, we study two less-touched but challenging problems in hazy image rendering, namely, i) how to estimate the transmission map from a single image without auxiliary information, and ii) how to adaptively learn the airlight from exemplars, i.e., unpaired real hazy images.

Image Dehazing Neural Rendering

Controllable cardiac synthesis via disentangled anatomy arithmetic

1 code implementation 4 Jul 2021 Spyridon Thermos, Xiao Liu, Alison O'Neil, Sotirios A. Tsaftaris

Motivated by the ability to disentangle images into spatial anatomy (tensor) factors and accompanying imaging (vector) representations, we propose a framework termed "disentangled anatomy arithmetic", in which a generative model learns to combine anatomical factors of different input images such that when they are re-entangled with the desired imaging modality (e.g. MRI), plausible new cardiac images are created with the target characteristics.


Boost-R: Gradient Boosted Trees for Recurrence Data

no code implementations 3 Jul 2021 Xiao Liu, Rong Pan

Boost-R constructs an ensemble of gradient boosted additive trees to estimate the cumulative intensity function of the recurrent event process, where a new tree is added to the ensemble by minimizing the regularized L2 distance between the observed and predicted cumulative intensity.


1st Place Solutions for UG2+ Challenge 2021 -- (Semi-)supervised Face detection in the low light condition

no code implementations 2 Jul 2021 Pengcheng Wang, Lingqiao Ji, Zhilong Ji, Yuan Gao, Xiao Liu

In this technical report, we briefly introduce the solution of our team "TAL-ai" for (Semi-) supervised Face detection in the low light condition in UG2+ Challenge in CVPR 2021.

Face Detection Image Enhancement +2

Multi-Granularity Network with Modal Attention for Dense Affective Understanding

no code implementations 18 Jun 2021 Baoming Yan, Lin Wang, Ke Gao, Bo Gao, Xiao Liu, Chao Ban, Jiang Yang, Xiaobo Li

Video affective understanding, which aims to predict the evoked expressions by the video content, is desired for video creation and recommendation.

A Self-supervised Method for Entity Alignment

1 code implementation 17 Jun 2021 Xiao Liu, Haoyun Hong, Xinghao Wang, Zeyi Chen, Evgeny Kharlamov, Yuxiao Dong, Jie Tang

We present SelfKG by leveraging this discovery to design a contrastive learning strategy across two KGs.

Contrastive Learning Entity Alignment +2

3rd Place Solution for Short-video Face Parsing Challenge

no code implementations 14 Jun 2021 Xiao Liu, XiaoFei Si, Jiangtao Xie

Benefiting from the edge information and edge attention loss, the proposed EANet achieves 86.16% accuracy in the Short-video Face Parsing track of the 3rd Person in Context (PIC) Workshop and Challenge, ranking third.

Face Parsing

Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness

1 code implementation 28 Apr 2021 Manyu Zhu, Dongliang He, Xin Li, Chao Li, Fu Li, Xiao Liu, Errui Ding, Zhaoxiang Zhang

Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.

Image Inpainting

Pseudo-ISP: Learning Pseudo In-camera Signal Processing Pipeline from A Color Image Denoiser

1 code implementation 18 Mar 2021 Yue Cao, Xiaohe Wu, Shuran Qi, Xiao Liu, Zhongqin Wu, WangMeng Zuo

To begin with, the pre-trained denoiser is used to generate the pseudo clean images for the test images.


GPT Understands, Too

5 code implementations 18 Mar 2021 Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang

On the SuperGLUE benchmark, GPTs achieve comparable and sometimes better performance to similar-sized BERTs in supervised learning.

Knowledge Probing Natural Language Understanding +1

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

2 code implementations ACL 2022 Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang

On a wide range of tasks across NLU, conditional and unconditional generation, GLM outperforms BERT, T5, and GPT given the same model sizes and data, and achieves the best performance from a single pretrained model with 1.25x parameters of BERT Large, demonstrating its generalizability to different downstream tasks.

Ranked #2 on Language Modelling on WikiText-103 (using extra training data)

Abstractive Text Summarization Classification +4

Understanding WeChat User Preferences and "Wow" Diffusion

1 code implementation 4 Mar 2021 Fanjin Zhang, Jie Tang, Xueyi Liu, Zhenyu Hou, Yuxiao Dong, Jing Zhang, Xiao Liu, Ruobing Xie, Kai Zhuang, Xu Zhang, Leyu Lin, Philip S. Yu

"Top Stories" is a novel friend-enhanced recommendation engine in WeChat, in which users can read articles based on preferences of both their own and their friends.

Graph Representation Learning Social and Information Networks

OAG-BERT: Towards A Unified Backbone Language Model For Academic Knowledge Services

1 code implementation 3 Mar 2021 Xiao Liu, Da Yin, Jingnan Zheng, Xingjian Zhang, Peng Zhang, Hongxia Yang, Yuxiao Dong, Jie Tang

Academic knowledge services have substantially facilitated the development of the science enterprise by providing a plenitude of efficient research tools.

Language Modelling Link Prediction

A Novel Graph-based Computation Offloading Strategy for Workflow Applications in Mobile Edge Computing

no code implementations 24 Feb 2021 Xuejun Li, Tianxiang Chen, Dong Yuan, Jia Xu, Xiao Liu

To achieve better Quality of Service (QoS), for instance, faster response time and lower energy consumption, computation offloading is widely used in the MEC environment.

Edge-computing Distributed, Parallel, and Cluster Computing C.2.4

GraphGallery: A Platform for Fast Benchmarking and Easy Development of Graph Neural Networks Based Intelligent Software

1 code implementation 16 Feb 2021 Jintang Li, Kun Xu, Liang Chen, Zibin Zheng, Xiao Liu

Graph Neural Networks (GNNs) have recently been shown to be powerful tools for representing and analyzing graph data.

Improved Signed Distance Function for 2D Real-time SLAM and Accurate Localization

no code implementations 20 Jan 2021 Xingyin Fu, Zheng Fang, Xizhen Xiao, Yijia He, Xiao Liu

In this paper, we propose an improved Signed Distance Function (SDF) for both 2D SLAM and pure localization to improve the accuracy of mapping and localization.

Pose Estimation

Integrated 3C in NOMA-enabled Remote-E-Health Systems

no code implementations 5 Jan 2021 Xiao Liu, Yuanwei Liu, Zhong Yang, Xinwei Yue, Chuan Wang, Yue Chen

A novel framework is proposed to integrate communication, control and computing (3C) into the fifth-generation and beyond (5GB) wireless networks for satisfying the ultra-reliable low-latency connectivity requirements of remote-e-Health systems.

News-Driven Stock Prediction Using Noisy Equity State Representation

no code implementations 1 Jan 2021 Xiao Liu, Heyan Huang, Yue Zhang

News-driven stock prediction investigates the correlation between news events and stock price movements.

Stock Prediction

Unbox the Blackbox: Predict and Interpret YouTube Viewership Using Deep Learning

no code implementations 21 Dec 2020 Jiaheng Xie, Xiao Liu

Although deep learning champions viewership prediction, it lacks interpretability, which is fundamental to increasing the adoption of predictive models and prescribing measurements to improve viewership.

Misinformation Video Description

SID-NISM: A Self-supervised Low-light Image Enhancement Framework

no code implementations 16 Dec 2020 Lijun Zhang, Xiao Liu, Erik Learned-Miller, Hui Guan

When capturing images in low-light conditions, the images often suffer from low visibility, which not only degrades the visual aesthetics of images, but also significantly degenerates the performance of many computer vision algorithms.

Low-Light Image Enhancement

Robotic Communications for 5G and Beyond: Challenges and Research Opportunities

no code implementations 9 Dec 2020 Yuanwei Liu, Xiao Liu, Xinyu Gao, Xidong Mu, Xiangwei Zhou, Octavia A. Dobre, H. Vincent Poor

Furthermore, dynamic trajectory design and resource allocation for both indoor and outdoor robots are provided to verify the performance of robotic communications in the context of typical robotic application scenarios.

Robotics Systems and Control Signal Processing Systems and Control

Intelligent Reflecting Surface Aided Multi-Cell NOMA Networks

no code implementations 7 Dec 2020 Wanli Ni, Xiao Liu, Yuanwei Liu, Hui Tian, Yue Chen

This paper proposes a novel framework of resource allocation in intelligent reflecting surface (IRS) aided multi-cell non-orthogonal multiple access (NOMA) networks, where a sum-rate maximization problem is formulated.

Path Design and Resource Management for NOMA enhanced Indoor Intelligent Robots

no code implementations 23 Nov 2020 Ruikang Zhong, Xiao Liu, Yuanwei Liu, Yue Chen, Xianbin Wang

Our simulation results demonstrate that 1) With the aid of NOMA techniques, the communication reliability of IRs is effectively improved; 2) The radio map is qualified to be a virtual training environment, and its statistical channel state information improves training efficiency by about 30%; 3) The proposed DT-DPG algorithm is superior to the conventional deep deterministic policy gradient (DDPG) algorithm in terms of optimization performance, training time, and anti-local optimum ability.

Management reinforcement-learning +1

Language Models are Open Knowledge Graphs

2 code implementations 22 Oct 2020 Chenguang Wang, Xiao Liu, Dawn Song

This paper shows how to construct knowledge graphs (KGs) from pre-trained language models (e.g., BERT, GPT-2/3), without human supervision.

Knowledge Graphs

Multi-Agent Reinforcement Learning in NOMA-aided UAV Networks for Cellular Offloading

1 code implementation 18 Oct 2020 Ruikang Zhong, Xiao Liu, Yuanwei Liu, Yue Chen

Afterward, a mutual deep Q-network (MDQN) algorithm is proposed to jointly determine the optimal 3D trajectory and power allocation of UAVs.

Multi-agent Reinforcement Learning reinforcement-learning +1

NOMA in UAV-aided cellular offloading: A machine learning approach

no code implementations 18 Oct 2020 Ruikang Zhong, Xiao Liu, Yuanwei Liu, Yue Chen

A novel framework is proposed for cellular offloading with the aid of multiple unmanned aerial vehicles (UAVs), while the non-orthogonal multiple access (NOMA) technique is employed at each UAV to further improve the spectrum efficiency of the wireless network.

BIG-bench Machine Learning

Difference-in-Differences: Bridging Normalization and Disentanglement in PG-GAN

no code implementations 16 Oct 2020 Xiao Liu, Jiajie Zhang, Siting Li, Zuotong Wu, Yang Yu

We discover that pixel normalization causes object entanglement by in-painting the area occupied by ablated objects.


Machine Learning Empowered Trajectory and Passive Beamforming Design in UAV-RIS Wireless Networks

no code implementations 6 Oct 2020 Xiao Liu, Yuanwei Liu, Yue Chen

The energy consumption minimization problem is formulated by jointly designing the movement of the UAV, phase shifts of the RIS, power allocation policy from the UAV to MUs, as well as determining the dynamic decoding order.

BIG-bench Machine Learning Q-Learning

TP-LSD: Tri-Points Based Line Segment Detector

1 code implementation ECCV 2020 Siyu Huang, Fangbo Qin, Pengfei Xiong, Ning Ding, Yijia He, Xiao Liu

To realize one-step detection with a faster and more compact model, we introduce the tri-points representation, converting the line segment detection to the end-to-end prediction of a root-point and two endpoints for each line segment.

Line Segment Detection

Measuring the Biases and Effectiveness of Content-Style Disentanglement

2 code implementations 27 Aug 2020 Xiao Liu, Spyridon Thermos, Gabriele Valvano, Agisilaos Chartsias, Alison O'Neil, Sotirios A. Tsaftaris

In this paper, we conduct an empirical study to investigate the role of different biases in content-style disentanglement settings and unveil the relationship between the degree of disentanglement and task performance.

Disentanglement Image-to-Image Translation

Disentangled Representations for Domain-generalized Cardiac Segmentation

1 code implementation 26 Aug 2020 Xiao Liu, Spyridon Thermos, Agisilaos Chartsias, Alison O'Neil, Sotirios A. Tsaftaris

Robust cardiac image segmentation is still an open challenge due to the inability of the existing methods to achieve satisfactory performance on unseen data of different domains.

Anatomy Cardiac Segmentation +4

Dialogue State Induction Using Neural Latent Variable Models

1 code implementation 13 Aug 2020 Qingkai Min, Libo Qin, Zhiyang Teng, Xiao Liu, Yue Zhang

Dialogue state modules are a useful component in a task-oriented dialogue system.

Reconfigurable Intelligent Surfaces: Principles and Opportunities

no code implementations 7 Jul 2020 Yuanwei Liu, Xiao Liu, Xidong Mu, Tianwei Hou, Jiaqi Xu, Marco Di Renzo, Naofal Al-Dhahir

In this context, we provide a comprehensive overview of the state-of-the-art on RISs, with focus on their operating principles, performance evaluation, beamforming design and resource management, applications of machine learning to RIS-enhanced wireless networks, as well as the integration of RISs with other emerging technologies.

BIG-bench Machine Learning Management

Resource Allocation for Multi-Cell IRS-Aided NOMA Networks

no code implementations 21 Jun 2020 Wanli Ni, Xiao Liu, Yuanwei Liu, Hui Tian, Yue Chen

This paper proposes a novel framework of resource allocation in multi-cell intelligent reflecting surface (IRS) aided non-orthogonal multiple access (NOMA) networks, where an IRS is deployed to enhance the wireless service.


Self-supervised Learning: Generative or Contrastive

no code implementations 15 Jun 2020 Xiao Liu, Fanjin Zhang, Zhenyu Hou, Zhaoyu Wang, Li Mian, Jing Zhang, Jie Tang

As an alternative, self-supervised learning attracts many researchers for its soaring performance on representation learning in the last several years.

Graph Learning Representation Learning +1

Online non-convex learning for river pollution source identification

no code implementations 22 May 2020 Wenjie Huang, Jing Jiang, Xiao Liu

In this paper, novel gradient-based online learning algorithms are developed to investigate an important environmental application: real-time river pollution source identification, which aims at estimating the released mass, location, and time of a river pollution source based on downstream sensor data monitoring the pollution concentration.

Studying Product Competition Using Representation Learning

no code implementations 21 May 2020 Fanglin Chen, Xiao Liu, Davide Proserpio, Isamar Troncoso, Feiyu Xiong

We show that, compared with state-of-the-art models, our approach is faster, and can produce more accurate demand forecasts and price elasticities.

Causal Inference Decision Making +1

Neighborhood Matching Network for Entity Alignment

1 code implementation ACL 2020 Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Dongyan Zhao

This paper presents Neighborhood Matching Network (NMN), a novel entity alignment framework for tackling the structural heterogeneity challenge.

Entity Alignment Graph Sampling +1

Using Noisy Self-Reports to Predict Twitter User Demographics

1 code implementation NAACL (SocialNLP) 2021 Zach Wood-Doughty, Paiheng Xu, Xiao Liu, Mark Dredze

We present a method to identify self-reports of race and ethnicity from Twitter profile descriptions.

M^3VSNet: Unsupervised Multi-metric Multi-view Stereo Network

1 code implementation 30 Apr 2020 Baichuan Huang, Hongwei Yi, Can Huang, Yijia He, Jingbin Liu, Xiao Liu

To improve the robustness and completeness of point cloud reconstruction, we propose a novel multi-metric loss function that combines pixel-wise and feature-wise loss functions to learn the inherent constraints from different perspectives of matching correspondences.

Point cloud reconstruction

Have you forgotten? A method to assess if machine learning models have forgotten data

no code implementations 21 Apr 2020 Xiao Liu, Sotirios A. Tsaftaris

In the era of deep learning, aggregation of data from several sources is a common approach to ensuring data diversity.

BIG-bench Machine Learning

M^3VSNet: Unsupervised Multi-metric Multi-view Stereo Network

1 code implementation 21 Apr 2020 Baichuan Huang, Hongwei Yi, Can Huang, Yijia He, Jingbin Liu, Xiao Liu

To improve the robustness and completeness of point cloud reconstruction, we propose a novel multi-metric loss function that combines pixel-wise and feature-wise loss functions to learn the inherent constraints from different perspectives of matching correspondences.

Point cloud reconstruction

Leveraging Planar Regularities for Point Line Visual-Inertial Odometry

no code implementations 16 Apr 2020 Xin Li, Yijia He, Jinlong Lin, Xiao Liu

To improve the accuracy of 3D mesh generation and localization, we propose a tightly-coupled monocular VIO system, PLP-VIO, which exploits point features and line features as well as plane regularities.

Predictions of 2019-nCoV Transmission Ending via Comprehensive Methods

no code implementations 12 Feb 2020 Tianyu Zeng, Yunong Zhang, Zhenyu Li, Xiao Liu, Binbin Qiu

Since the SARS outbreak in 2003, a lot of predictive epidemiological models have been proposed.

Artificial Intelligence Aided Next-Generation Networks Relying on UAVs

no code implementations 28 Jan 2020 Xiao Liu, Mingzhe Chen, Yuanwei Liu, Yue Chen, Shuguang Cui, Lajos Hanzo

Artificial intelligence (AI) assisted unmanned aerial vehicle (UAV) aided next-generation networking is proposed for dynamic environments.

RIS Enhanced Massive Non-orthogonal Multiple Access Networks: Deployment and Passive Beamforming Design

no code implementations 28 Jan 2020 Xiao Liu, Yuanwei Liu, Yue Chen, H. Vincent Poor

A novel framework is proposed for the deployment and passive beamforming design of a reconfigurable intelligent surface (RIS) with the aid of non-orthogonal multiple access (NOMA) technology.

Dynamic Instance Normalization for Arbitrary Style Transfer

no code implementations 16 Nov 2019 Yongcheng Jing, Xiao Liu, Yukang Ding, Xinchao Wang, Errui Ding, Mingli Song, Shilei Wen

Prior normalization methods rely on affine transformations to produce arbitrary image style transfers, of which the parameters are computed in a pre-defined way.

Style Transfer

TruNet: Short Videos Generation from Long Videos via Story-Preserving Truncation

no code implementations 14 Oct 2019 Fan Yang, Xiao Liu, Dongliang He, Chuang Gan, Jian Wang, Chao Li, Fu Li, Shilei Wen

In this work, we introduce a new problem, named story-preserving long video truncation, that requires an algorithm to automatically truncate a long-duration video into multiple short and attractive sub-videos with each one containing an unbroken story.

Highlight Detection Video Summarization

Jointly Learning Entity and Relation Representations for Entity Alignment

1 code implementation IJCNLP 2019 Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Dongyan Zhao

Entity alignment is a viable means for integrating heterogeneous knowledge among different knowledge graphs (KGs).

Ranked #10 on Entity Alignment on DBP15k zh-en (using extra training data)

Entity Alignment Entity Embeddings +1

Image Inpainting with Learnable Bidirectional Attention Maps

1 code implementation ICCV 2019 Chaohao Xie, Shaohui Liu, Chao Li, Ming-Ming Cheng, WangMeng Zuo, Xiao Liu, Shilei Wen, Errui Ding

Most convolutional network (CNN)-based inpainting methods adopt standard convolution to indistinguishably treat valid pixels and holes, making them limited in handling irregular holes and more likely to generate inpainting results with color discrepancy and blurriness.

Image Inpainting

Deep Concept-wise Temporal Convolutional Networks for Action Localization

2 code implementations 26 Aug 2019 Xin Li, Tianwei Lin, Xiao Liu, Chuang Gan, WangMeng Zuo, Chao Li, Xiang Long, Dongliang He, Fu Li, Shilei Wen

In this paper, we empirically find that stacking more conventional temporal convolution layers actually deteriorates action classification performance, possibly because all channels of the 1D feature map, which are generally highly abstract and can be regarded as latent concepts, are excessively recombined in temporal convolution.

Action Classification Action Localization

Relation-Aware Entity Alignment for Heterogeneous Knowledge Graphs

1 code implementation 22 Aug 2019 Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Rui Yan, Dongyan Zhao

Entity alignment is the task of linking entities with the same real-world identity from different knowledge graphs (KGs), which has been recently dominated by embedding-based methods.

Ranked #12 on Entity Alignment on DBP15k zh-en (using extra training data)

Entity Alignment Entity Embeddings +1

BMN: Boundary-Matching Network for Temporal Action Proposal Generation

11 code implementations ICCV 2019 Tianwei Lin, Xiao Liu, Xin Li, Errui Ding, Shilei Wen

To address these difficulties, we introduce the Boundary-Matching (BM) mechanism to evaluate confidence scores of densely distributed proposals, which denote a proposal as a matching pair of starting and ending boundaries and combine all densely distributed BM pairs into the BM confidence map.

Action Detection Action Recognition +1

Open Domain Event Extraction Using Neural Latent Variable Models

1 code implementation ACL 2019 Xiao Liu, He-Yan Huang, Yue Zhang

We consider open domain event extraction, the task of extracting unconstrained types of events from news clusters.

Event Extraction

A Soft Label Strategy for Target-Level Sentiment Classification

no code implementations WS 2019 Da Yin, Xiao Liu, Xiuyu Wu, Baobao Chang

In this paper, we propose a soft label approach to the target-level sentiment classification task, in which a history-based soft labeling model is proposed to measure the possibility of a context word as an opinion word.

Classification General Classification +1

Adapting Image Super-Resolution State-of-the-arts and Learning Multi-model Ensemble for Video Super-Resolution

no code implementations 7 May 2019 Chao Li, Dongliang He, Xiao Liu, Yukang Ding, Shilei Wen

Recently, image super-resolution has been widely studied and achieved significant progress by leveraging the power of deep convolutional neural networks.

Image Super-Resolution Video Super-Resolution

STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing

4 code implementations CVPR 2019 Ming Liu, Yukang Ding, Min Xia, Xiao Liu, Errui Ding, WangMeng Zuo, Shilei Wen

Arbitrary attribute editing generally can be tackled by incorporating encoder-decoder and generative adversarial networks.


DF-SLAM: A Deep-Learning Enhanced Visual SLAM System based on Deep Local Features

no code implementations 22 Jan 2019 Rong Kang, Jieqi Shi, Xueming Li, Yang Liu, Xiao Liu

As the foundation of driverless vehicles and intelligent robots, Simultaneous Localization and Mapping (SLAM) has attracted much attention these days.

Association Simultaneous Localization and Mapping

Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos

1 code implementation 21 Jan 2019 Dongliang He, Xiang Zhao, Jizhou Huang, Fu Li, Xiao Liu, Shilei Wen

The task of video grounding, which temporally localizes a natural language description in a video, plays an important role in understanding videos.

Decision Making Multi-Task Learning +2

Distant Supervision for Relation Extraction with Linear Attenuation Simulation and Non-IID Relevance Embedding

no code implementations 22 Dec 2018 Changsen Yuan, He-Yan Huang, Chong Feng, Xiao Liu, Xiaochi Wei

Distant supervision for relation extraction is an efficient method to reduce labor costs and has been widely used to seek novel relational facts in large corpora, which can be identified as a multi-instance multi-label problem.

Relation Extraction

StNet: Local and Global Spatial-Temporal Modeling for Action Recognition

5 code implementations 5 Nov 2018 Dongliang He, Zhichao Zhou, Chuang Gan, Fu Li, Xiao Liu, Yandong Li, Li-Min Wang, Shilei Wen

In this paper, in contrast to the existing CNN+RNN or pure 3D convolution based approaches, we explore a novel spatial temporal network (StNet) architecture for both local and global spatial-temporal modeling in videos.

Action Recognition

Fine-grained Video Categorization with Redundancy Reduction Attention

no code implementations ECCV 2018 Chen Zhu, Xiao Tan, Feng Zhou, Xiao Liu, Kaiyu Yue, Errui Ding, Yi Ma

Specifically, it first summarizes the video by weight-summing all feature vectors in the feature maps of selected frames with a spatio-temporal soft attention, and then predicts which channels to suppress or to enhance according to this summary with a learned non-linear transform.

Action Recognition

Improving Annotation for 3D Pose Dataset of Fine-Grained Object Categories

2 code implementations 19 Oct 2018 Yaming Wang, Xiao Tan, Yi Yang, Ziyu Li, Xiao Liu, Feng Zhou, Larry S. Davis

Existing 3D pose datasets of object categories are limited to generic object types and lack fine-grained information.

3D Pose Estimation Object Recognition

PANDA: AdaPtive Noisy Data Augmentation for Regularization of Undirected Graphical Models

no code implementations 11 Oct 2018 Yi-Nan Li, Xiao Liu, Fang Liu

We propose an AdaPtive Noise Augmentation (PANDA) technique to regularize the estimation and construction of undirected graphical models.

Data Augmentation Variable Selection

Real-time Scholarly Retweeting Prediction System

no code implementations COLING 2018 Zhunchen Luo, Xiao Liu

Finally, we combine scholarly features with the Tweet Scholar Blocks to predict whether a scholarly tweet will be retweeted.

CerfGAN: A Compact, Effective, Robust, and Fast Model for Unsupervised Multi-Domain Image-to-Image Translation

no code implementations 28 May 2018 Xiao Liu, Shengchuan Zhang, Hong Liu, Xin Liu, Cheng Deng, Rongrong Ji

In principle, CerfGAN contains a novel component, i.e., a multi-class discriminator (MCD), which gives the model an extremely powerful ability to match multiple translation mappings.

Face Hallucination Image-to-Image Translation +2

Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification

3 code implementations CVPR 2018 Xiang Long, Chuang Gan, Gerard de Melo, Jiajun Wu, Xiao Liu, Shilei Wen

In this paper, however, we show that temporal information, especially longer-term patterns, may not be necessary to achieve competitive results on common video classification datasets.

Classification General Classification +1

Revisiting the Effectiveness of Off-the-shelf Temporal Modeling Approaches for Large-scale Video Classification

no code implementations 12 Aug 2017 Yunlong Bian, Chuang Gan, Xiao Liu, Fu Li, Xiang Long, Yandong Li, Heng Qi, Jie zhou, Shilei Wen, Yuanqing Lin

Experimental results on the challenging Kinetics dataset demonstrate that our proposed temporal modeling approaches can significantly improve existing approaches in large-scale video recognition tasks.

Action Classification General Classification +2

Deep Metric Learning with Angular Loss

1 code implementation ICCV 2017 Jian Wang, Feng Zhou, Shilei Wen, Xiao Liu, Yuanqing Lin

The modern image search system requires semantic understanding of images, and a key yet under-addressed problem is to learn a good metric for measuring the similarity between images.

Image Retrieval Metric Learning

Temporal Modeling Approaches for Large-scale Youtube-8M Video Understanding

1 code implementation 14 Jul 2017 Fu Li, Chuang Gan, Xiao Liu, Yunlong Bian, Xiang Long, Yandong Li, Zhichao Li, Jie zhou, Shilei Wen

This paper describes our solution for the video recognition task of the Google Cloud and YouTube-8M Video Understanding Challenge, which ranked in 3rd place.

Video Recognition Video Understanding

Kernel Pooling for Convolutional Neural Networks

no code implementations CVPR 2017 Yin Cui, Feng Zhou, Jiang Wang, Xiao Liu, Yuanqing Lin, Serge Belongie

We demonstrate how to approximate kernels such as Gaussian RBF up to a given order using compact explicit feature maps in a parameter-free manner.

Face Recognition Fine-Grained Visual Categorization +2

Deep Speaker: an End-to-End Neural Speaker Embedding System

15 code implementations 5 May 2017 Chao Li, Xiaokong Ma, Bing Jiang, Xiangang Li, Xuewei Zhang, Xiao Liu, Ying Cao, Ajay Kannan, Zhenyao Zhu

We present Deep Speaker, a neural speaker embedding system that maps utterances to a hypersphere where speaker similarity is measured by cosine similarity.

Speaker Identification Speaker Recognition

Dynamic Computational Time for Visual Attention

1 code implementation 30 Mar 2017 Zhichao Li, Yi Yang, Xiao Liu, Feng Zhou, Shilei Wen, Wei Xu

We propose a dynamic computational time model to accelerate the average processing time for recurrent visual attention (RAM).

reinforcement-learning reinforcement Learning

Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition

no code implementations 20 May 2016 Xiao Liu, Jiang Wang, Shilei Wen, Errui Ding, Yuanqing Lin

By designing a novel reward strategy, we are able to learn to locate regions that are spatially and semantically distinctive with a reinforcement learning algorithm.

reinforcement-learning reinforcement Learning

Deep Embedding for Spatial Role Labeling

no code implementations 28 Mar 2016 Oswaldo Ludwig, Xiao Liu, Parisa Kordjamshidi, Marie-Francine Moens

This paper introduces the visually informed embedding of word (VIEW), a continuous vector representation for a word extracted from a deep neural model trained using the Microsoft COCO data set to forecast the spatial arrangements between visual objects, given a textual description.