Search Results for author: Jing Liu

Found 207 papers, 81 papers with code

Deep Transferring Quantization

1 code implementation ECCV 2020 Zheng Xie, Zhiquan Wen, Jing Liu, Zhi-Qiang Liu, Xixian Wu, Mingkui Tan

Specifically, we propose a method named deep transferring quantization (DTQ) to effectively exploit the knowledge in a pre-trained full-precision model.

Face Recognition Image Classification +2

Learning Progressive Joint Propagation for Human Motion Prediction

no code implementations ECCV 2020 Yujun Cai, Lin Huang, Yiwei Wang, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Xu Yang, Yiheng Zhu, Xiaohui Shen, Ding Liu, Jing Liu, Nadia Magnenat Thalmann

Last, in order to incorporate a general motion space for high-quality prediction, we build a memory-based dictionary, which aims to preserve the global motion patterns in training data to guide the predictions.

Human motion prediction motion prediction

DuReader_vis: A Chinese Dataset for Open-domain Document Visual Question Answering

1 code implementation Findings (ACL) 2022 Le Qi, Shangwen Lv, Hongyu Li, Jing Liu, Yu Zhang, Qiaoqiao She, Hua Wu, Haifeng Wang, Ting Liu

Open-domain question answering has been used in a wide range of applications, such as web search and enterprise search, which usually takes clean texts extracted from various formats of documents (e.g., web pages, PDFs, or Word documents) as the information source.

document understanding Open-Domain Question Answering +1

Machine Reading Comprehension Data Augmentation for Sentence Selection Based on Similarity

no code implementations CCL 2022 Shuang Nie, Zheng Ye, Jun Qin, Jing Liu

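Common data augmentation methods for machine reading comprehension, such as back-translation, augment the passage or the question in isolation and do not consider the relationships within the passage-question-option triple. This paper therefore explores a data augmentation method that uses these triple relationships to filter passage sentences: by comparing the similarity of the passage to the question and to the options, it selects the passage sentences most closely related to both. In addition, to increase the difference between the triples of different options, we adopt a regularized Dropout strategy. Experimental results show that accuracy on the RACE dataset can be improved by 3.8%.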

Data Augmentation Machine Reading Comprehension +1

Self-Evaluation of Large Language Model based on Glass-box Features

no code implementations 7 Mar 2024 Hui Huang, Yingqi Qu, Jing Liu, Muyun Yang, Tiejun Zhao

The proliferation of open-source Large Language Models (LLMs) underscores the pressing need for evaluation methods.

Language Modelling Large Language Model

Enhancing Instructional Quality: Leveraging Computer-Assisted Textual Analysis to Generate In-Depth Insights from Educational Artifacts

no code implementations 6 Mar 2024 Zewei Tian, Min Sun, Alex Liu, Shawon Sarkar, Jing Liu

This paper explores the transformative potential of computer-assisted textual analysis in enhancing instructional quality through in-depth insights from educational artifacts.

SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model

no code implementations 28 Feb 2024 Bin Cao, Jianhao Yuan, Yexin Liu, Jian Li, Shuyang Sun, Jing Liu, Bo Zhao

To alleviate artifacts and improve the quality of synthetic images, we fine-tune a Vision-Language Model (VLM) as an artifact classifier to automatically identify and classify a wide range of artifacts and provide supervision for further optimizing generative models.

Image Generation Language Modelling

REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering

1 code implementation 27 Feb 2024 Yuhao Wang, Ruiyang Ren, Junyi Li, Wayne Xin Zhao, Jing Liu, Ji-Rong Wen

By combining the improvements in both architecture and training, our proposed REAR can better utilize external knowledge by effectively perceiving the relevance of retrieved documents.

Open-Domain Question Answering Retrieval

BASES: Large-scale Web Search User Simulation with Large Language Model based Agents

no code implementations 27 Feb 2024 Ruiyang Ren, Peng Qiu, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Hua Wu, Ji-Rong Wen, Haifeng Wang

Due to the excellent capacities of large language models (LLMs), it becomes feasible to develop LLM-based agents for reliable user simulation.

Information Retrieval Language Modelling +3

CCFC++: Enhancing Federated Clustering through Feature Decorrelation

no code implementations 20 Feb 2024 Jie Yan, Jing Liu, Yi-Zi Ning, Zhong-Yuan Zhang

In federated clustering, multiple data-holding clients collaboratively group data without exchanging raw data.

Clustering Contrastive Learning

Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions

1 code implementation 17 Feb 2024 Wenxuan Wang, Yisi Zhang, Xingjian He, Yichen Yan, Zijia Zhao, Xinlong Wang, Jing Liu

Previous datasets and methods for the classic VG task mainly rely on the prior assumption that the given expression must literally refer to the target object, which greatly impedes the practical deployment of agents in real-world scenarios.

Visual Grounding

Why Does Differential Privacy with Large Epsilon Defend Against Practical Membership Inference Attacks?

no code implementations 14 Feb 2024 Andrew Lowy, Zhuohang Li, Jing Liu, Toshiaki Koike-Akino, Kieran Parsons, Ye Wang

In practical applications, such a worst-case guarantee may be overkill: practical attackers may lack exact knowledge of (nearly all of) the private data, and our data set might be easier to defend, in some sense, than the worst-case data set.

Inference Attack Membership Inference Attack

M2ORT: Many-To-One Regression Transformer for Spatial Transcriptomics Prediction from Histopathology Images

no code implementations 19 Jan 2024 Hongyi Wang, Xiuju Du, Jing Liu, Shuyi Ouyang, Yen-Wei Chen, Lanfen Lin

To address this limit, we propose M2ORT, a many-to-one regression Transformer that can accommodate the hierarchical structure of the pathology images through a decoupled multi-scale feature extractor.

regression

CCFC: Bridging Federated Clustering and Contrastive Learning

1 code implementation 12 Jan 2024 Jie Yan, Jing Liu, Zhong-Yuan Zhang

Benefiting from representation learning, the clustering performance of CCFC is even double that of the best baseline methods in some cases.

Clustering Contrastive Learning +1

Temporal Adaptive RGBT Tracking with Modality Prompt

no code implementations 2 Jan 2024 Hongyu Wang, Xiaotao Liu, YiFan Li, Meng Sun, Dian Yuan, Jing Liu

RGBT tracking has been widely used in various fields such as robotics, surveillance processing, and autonomous driving.

Autonomous Driving

Signed Graph Neural Ordinary Differential Equation for Modeling Continuous-time Dynamics

1 code implementation 18 Dec 2023 Lanlan Chen, Kai Wu, Jian Lou, Jing Liu

Modeling continuous-time dynamics constitutes a foundational challenge, and uncovering inter-component correlations within complex systems holds promise for enhancing the efficacy of dynamic modeling.

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

1 code implementation 13 Dec 2023 Wenxuan Wang, Tongtian Yue, Yisi Zhang, Longteng Guo, Xingjian He, Xinlong Wang, Jing Liu

To foster future research into fine-grained visual grounding, our benchmark RefCOCOm, the MRES-32M dataset and model UniRES will be publicly available at https://github.com/Rubics-Xuan/MRES

Descriptive Object +3

Efficient Stitchable Task Adaptation

1 code implementation 29 Nov 2023 Haoyu He, Zizheng Pan, Jing Liu, Jianfei Cai, Bohan Zhuang

In this work, we present a novel framework, Efficient Stitchable Task Adaptation (ESTA), to efficiently produce a palette of fine-tuned models that adhere to diverse resource constraints.

Chatbot

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

1 code implementation 27 Nov 2023 Yushi Huang, Ruihao Gong, Jing Liu, Tianlong Chen, Xianglong Liu

Remarkably, our quantization approach, for the first time, achieves model performance nearly on par with the full-precision model under 4-bit weight quantization.

Denoising Image Generation +1

Open-Vocabulary Video Anomaly Detection

no code implementations 13 Nov 2023 Peng Wu, Xuerong Zhou, Guansong Pang, Yujia Sun, Jing Liu, Peng Wang, Yanning Zhang

Particularly, we devise a semantic knowledge injection module to introduce semantic knowledge from large language models for the detection task, and design a novel anomaly synthesis module to generate pseudo unseen anomaly videos with the help of large vision generation models for the classification task.

Anomaly Detection Video Anomaly Detection

An Interdisciplinary Outlook on Large Language Models for Scientific Research

no code implementations 3 Nov 2023 James Boyko, Joseph Cohen, Nathan Fox, Maria Han Veiga, Jennifer I-Hsiu Li, Jing Liu, Bernardo Modenesi, Andreas H. Rauch, Kenneth N. Reid, Soumi Tribedi, Anastasia Visheratina, Xin Xie

In this paper, we describe the capabilities and constraints of Large Language Models (LLMs) within disparate academic disciplines, aiming to delineate their strengths and limitations with precision.

Med-DANet V2: A Flexible Dynamic Architecture for Efficient Medical Volumetric Segmentation

no code implementations 28 Oct 2023 Haoran Shen, Yifu Zhang, Wenxuan Wang, Chen Chen, Jing Liu, Shanshan Song, Jiangyun Li

As a pioneering work, a dynamic architecture network for medical volumetric segmentation (i.e., Med-DANet) has achieved a favorable accuracy and efficiency trade-off by dynamically selecting a suitable 2D candidate model from the pre-defined model bank for different slices.

Computational Efficiency MRI segmentation +2

QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models

no code implementations 12 Oct 2023 Jing Liu, Ruihao Gong, Xiuying Wei, Zhiwei Dong, Jianfei Cai, Bohan Zhuang

Additionally, an adaptive strategy is designed to autonomously determine the optimal number of sub-channels for channel disassembly.

Quantization

EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models

no code implementations 5 Oct 2023 Yefei He, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang

While PTQ is efficient in terms of both time and data usage, it may lead to diminished performance at low bit-widths.

Denoising Image Generation +1

GLOBER: Coherent Non-autoregressive Video Generation via GLOBal Guided Video DecodER

1 code implementation NeurIPS 2023 Mingzhen Sun, Weining Wang, Zihan Qin, Jiahui Sun, Sihan Chen, Jing Liu

Specifically, we propose a video auto-encoder, where a video encoder encodes videos into global features, and a video decoder, built on a diffusion model, decodes the global features and synthesizes video frames in a non-autoregressive manner.

Video Generation

Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults

no code implementations 12 Sep 2023 Ahmed Adel Attia, Jing Liu, Wei Ai, Dorottya Demszky, Carol Espy-Wilson

Recent advancements in Automatic Speech Recognition (ASR) systems, exemplified by Whisper, have demonstrated the potential of these systems to approach human-level performance given sufficient data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models

no code implementations 11 Sep 2023 Li Chen, Mengyi Zhao, Yiheng Liu, Mingxu Ding, Yangyang Song, Shizun Wang, Xu Wang, Hao Yang, Jing Liu, Kang Du, Min Zheng

Personalized text-to-image generation has emerged as a powerful and sought-after tool, empowering users to create customized images based on their specific concepts and prompts.

Text-to-Image Generation

Model-agnostic network inference enhancement from noisy measurements via curriculum learning

1 code implementation 5 Sep 2023 Kai Wu, Yuanyuan Li, Jing Liu

Noise is a pervasive element within real-world measurement data, significantly undermining the performance of network inference models.

FG-Net: Facial Action Unit Detection with Generalizable Pyramidal Features

1 code implementation 23 Aug 2023 Yufeng Yin, Di Chang, Guoxian Song, Shen Sang, Tiancheng Zhi, Jing Liu, Linjie Luo, Mohammad Soleymani

The proposed FG-Net achieves a strong generalization ability for heatmap-based AU detection thanks to the generalizable and semantic-rich features extracted from the pre-trained generative model.

Action Unit Detection Cross-corpus +1

March in Chat: Interactive Prompting for Remote Embodied Referring Expression

1 code implementation ICCV 2023 Yanyuan Qiao, Yuankai Qi, Zheng Yu, Jing Liu, Qi Wu

Nevertheless, this poses more challenges than other VLN tasks since it requires agents to infer a navigation plan only based on a short instruction.

Referring Expression Vision and Language Navigation

EAVL: Explicitly Align Vision and Language for Referring Image Segmentation

no code implementations 18 Aug 2023 Yichen Yan, Xingjian He, Wenxuan Wang, Sihan Chen, Jing Liu

In previous approaches, fused vision-language features are directly fed into a decoder and pass through a convolution with a fixed kernel to obtain the result, which follows a similar pattern as traditional image segmentation.

Image Segmentation Referring Expression Segmentation +2

Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception

no code implementations ICCV 2023 Kun Yang, Dingkang Yang, Jingyu Zhang, Mingcheng Li, Yang Liu, Jing Liu, Hanqi Wang, Peng Sun, Liang Song

In this paper, we propose SCOPE, a novel collaborative perception framework that aggregates the spatio-temporal awareness characteristics across on-road agents in an end-to-end manner.

3D Object Detection Autonomous Vehicles +1

Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model

1 code implementation 24 Jul 2023 Peng Wu, Jing Liu, Xiangteng He, Yuxin Peng, Peng Wang, Yanning Zhang

In this context, we propose a novel task called Video Anomaly Retrieval (VAR), which aims to pragmatically retrieve relevant anomalous videos by cross-modalities, e.g., language descriptions and synchronous audios.

Anomaly Detection Retrieval +2

Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation

1 code implementation 20 Jul 2023 Ruiyang Ren, Yuhao Wang, Yingqi Qu, Wayne Xin Zhao, Jing Liu, Hao Tian, Hua Wu, Ji-Rong Wen, Haifeng Wang

In this study, we present an initial analysis of the factual knowledge boundaries of LLMs and how retrieval augmentation affects LLMs on open-domain QA.

Open-Domain Question Answering Retrieval +1

Perceptual Quality Assessment of Omnidirectional Audio-visual Signals

1 code implementation 20 Jul 2023 Xilei Zhu, Huiyu Duan, Yuqin Cao, Yuxin Zhu, Yucheng Zhu, Jing Liu, Li Chen, Xiongkuo Min, Guangtao Zhai

Omnidirectional videos (ODVs) play an increasingly important role in application fields such as medicine, education, advertising, and tourism.

AIGCIQA2023: A Large-scale Image Quality Assessment Database for AI Generated Images: from the Perspectives of Quality, Authenticity and Correspondence

1 code implementation 1 Jul 2023 Jiarui Wang, Huiyu Duan, Jing Liu, Shi Chen, Xiongkuo Min, Guangtao Zhai

In this paper, to better understand human visual preferences for AIGIs, we establish a large-scale IQA database for AIGC, named AIGCIQA2023.

Image Quality Assessment Text-to-Image Generation

Stitched ViTs are Flexible Vision Backbones

1 code implementation 30 Jun 2023 Zizheng Pan, Jing Liu, Haoyu He, Jianfei Cai, Bohan Zhuang

With extensive experiments on ImageNet-1K, ADE20K, COCO-Stuff-10K and NYUv2, SN-Netv2 demonstrates superior performance over SN-Netv1 on downstream dense predictions and shows strong ability as a flexible vision backbone, achieving great advantages in both training efficiency and deployment flexibility.

Description-Enhanced Label Embedding Contrastive Learning for Text Classification

1 code implementation 15 Jun 2023 Kun Zhang, Le Wu, Guangyi Lv, Enhong Chen, Shulan Ruan, Jing Liu, Zhiqiang Zhang, Jun Zhou, Meng Wang

Then, we propose a novel Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as optimization targets.

Contrastive Learning Relation +3

COSA: Concatenated Sample Pretrained Vision-Language Foundation Model

2 code implementations 15 Jun 2023 Sihan Chen, Xingjian He, Handong Li, Xiaojie Jin, Jiashi Feng, Jing Liu

Due to the limited scale and quality of video-text training corpus, most vision-language foundation models employ image-text datasets for pretraining and primarily focus on modeling visually semantic representations while disregarding temporal semantic representations and correlations.

Question Answering Retrieval

VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

1 code implementation NeurIPS 2023 Sihan Chen, Handong Li, Qunbo Wang, Zijia Zhao, Mingzhen Sun, Xinxin Zhu, Jing Liu

Based on the proposed VAST-27M dataset, we train an omni-modality video-text foundational model named VAST, which can perceive and process vision, audio, and subtitle modalities from video, and better support various tasks including vision-text, audio-text, and multi-modal video-text tasks (retrieval, captioning and QA).

Ranked #1 on Image Captioning on COCO Captions (SPICE metric, using extra training data)

Audio captioning Audio-Visual Captioning +14

Pre-trained transformer for adversarial purification

no code implementations 27 May 2023 Kai Wu, Yujian Betterest Li, Jian Lou, XiaoYu Zhang, Handing Wang, Jing Liu

Deep neural networks are alarmingly vulnerable and sensitive to adversarial attacks, the most common of which against deployed services are evasion-based.

MMNet: Multi-Mask Network for Referring Image Segmentation

no code implementations 24 May 2023 Yichen Yan, Xingjian He, Wenxuan Wan, Jing Liu

However, this task is challenging due to the distinct data properties between text and image, and the randomness introduced by diverse objects and unrestricted language expression.

Image Segmentation Segmentation +1

VLAB: Enhancing Video Language Pre-training by Feature Adapting and Blending

no code implementations 22 May 2023 Xingjian He, Sihan Chen, Fan Ma, Zhicheng Huang, Xiaojie Jin, Zikang Liu, Dongmei Fu, Yi Yang, Jing Liu, Jiashi Feng

Towards this goal, we propose a novel video-text pre-training method dubbed VLAB: Video Language pre-training by feature Adapting and Blending, which transfers CLIP representations to video pre-training tasks and develops unified video multimodal models for a wide range of video-text tasks.

Ranked #1 on Visual Question Answering (VQA) on MSVD-QA (using extra training data)

Question Answering Retrieval +6

Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner

1 code implementation 19 May 2023 Zikang Liu, Sihan Chen, Longteng Guo, Handong Li, Xingjian He, Jing Liu

In this paper, we propose a novel method called Joint QA and DC GEneration (JADE), which utilizes a pre-trained multimodal model and easily-crawled image-text pairs to automatically generate and filter large-scale VQA and dense captioning datasets.

Dense Captioning Image Captioning +4

CM-MaskSD: Cross-Modality Masked Self-Distillation for Referring Image Segmentation

no code implementations 19 May 2023 Wenxuan Wang, Jing Liu, Xingjian He, Yisi Zhang, Chen Chen, Jiachen Shen, Yan Zhang, Jiangyun Li

Referring image segmentation (RIS) is a fundamental vision-language task that intends to segment a desired object from an image based on a given natural language expression.

Image Segmentation Segmentation +1

TOME: A Two-stage Approach for Model-based Retrieval

no code implementations 18 May 2023 Ruiyang Ren, Wayne Xin Zhao, Jing Liu, Hua Wu, Ji-Rong Wen, Haifeng Wang

Recently, model-based retrieval has emerged as a new paradigm in text retrieval that discards the index in the traditional retrieval model and instead memorizes the candidate corpora using model parameters.

Natural Questions Retrieval +1

Configurable Spatial-Temporal Hierarchical Analysis for Flexible Video Anomaly Detection

no code implementations 12 May 2023 Kai Cheng, Xinhua Zeng, Yang Liu, Tian Wang, Chengxin Pang, Jing Teng, Zhaoyang Xia, Jing Liu

Since the anomaly set is complicated and unbounded, our STHA can adjust its detection ability to adapt to human detection demands and to the complexity of the anomalies that have occurred in the history of a scene.

Anomaly Detection Human Detection +2

SLSG: Industrial Image Anomaly Detection by Learning Better Feature Embeddings and One-Class Classification

no code implementations 30 Apr 2023 Minghui Yang, Jing Liu, Zhiwei Yang, Zhaoyang Wu

Focusing on more effective and comprehensive anomaly detection, we propose a network based on self-supervised learning and self-attentive graph convolution (SLSG) for anomaly detection.

Classification One-Class Classification +1

B2Opt: Learning to Optimize Black-box Optimization with Little Budget

no code implementations 24 Apr 2023 XiaoBin Li, Kai Wu, XiaoYu Zhang, Handing Wang, Jing Liu

To achieve this, 1) drawing on the mechanism of genetic algorithms, we propose a deep neural network framework called B2Opt, which has a stronger representation of optimization strategies based on survival of the fittest; 2) B2Opt can utilize cheap surrogate functions of the target task to guide the design of efficient optimization strategies.

Med-Tuning: Parameter-Efficient Transfer Learning with Fine-Grained Feature Enhancement for Medical Volumetric Segmentation

no code implementations 21 Apr 2023 Wenxuan Wang, Jiachen Shen, Chen Chen, Jianbo Jiao, Jing Liu, Yan Zhang, Shanshan Song, Jiangyun Li

In this paper, we present a study on parameter-efficient transfer learning for medical volumetric segmentation and propose a new framework named Med-Tuning based on intra-stage feature enhancement and inter-stage feature interaction.

Segmentation Transfer Learning

DECN: Automated Evolutionary Algorithms via Evolution Inspired Deep Convolution Network

no code implementations 19 Apr 2023 Kai Wu, Penghui Liu, Jing Liu

Evolutionary algorithms (EAs) have emerged as a powerful framework for optimization, especially for black-box optimization.

Evolutionary Algorithms Meta-Learning

VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset

1 code implementation 17 Apr 2023 Sihan Chen, Xingjian He, Longteng Guo, Xinxin Zhu, Weining Wang, Jinhui Tang, Jing Liu

Different from widely-studied vision-language pretraining models, VALOR jointly models relationships of vision, audio and language in an end-to-end manner.

Ranked #1 on Video Captioning on VATEX (using extra training data)

Audio captioning Audio-Video Question Answering (AVQA) +16

Calibrating Cross-modal Features for Text-Based Person Searching

no code implementations 5 Apr 2023 Donglai Wei, Sipeng Zhang, Tong Yang, Yang Liu, Jing Liu

On the other hand, the Masking Caption Modeling (MCM) loss leverages a masked captions prediction task to establish detailed and generic relationships between textual and visual parts.

Person Search Text based Person Search

PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers

no code implementations 30 Mar 2023 Rahul Pandey, Roger Ren, Qi Luo, Jing Liu, Ariya Rastrow, Ankur Gandhe, Denis Filimonov, Grant Strimel, Andreas Stolcke, Ivan Bulyko

End-to-End (E2E) automatic speech recognition (ASR) systems used in voice assistants often have difficulties recognizing infrequent words personalized to the user, such as names and places.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Sounding Video Generator: A Unified Framework for Text-guided Sounding Video Generation

1 code implementation 29 Mar 2023 Jiawei Liu, Weining Wang, Sihan Chen, Xinxin Zhu, Jing Liu

In this work, we concentrate on the rarely investigated problem of text-guided sounding video generation and propose the Sounding Video Generator (SVG), a unified framework for generating realistic videos along with audio signals.

Audio Generation Contrastive Learning +1

OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis

no code implementations CVPR 2023 Hongyi Xu, Guoxian Song, Zihang Jiang, Jianfeng Zhang, Yichun Shi, Jing Liu, WanChun Ma, Jiashi Feng, Linjie Luo

We present OmniAvatar, a novel geometry-guided 3D head synthesis model trained from in-the-wild unstructured images that is capable of synthesizing diverse identity-preserved 3D heads with compelling dynamic details under full disentangled control over camera poses, facial expressions, head shapes, articulated neck and jaw poses.

AgileGAN3D: Few-Shot 3D Portrait Stylization by Augmented Transfer Learning

no code implementations 24 Mar 2023 Guoxian Song, Hongyi Xu, Jing Liu, Tiancheng Zhi, Yichun Shi, Jianfeng Zhang, Zihang Jiang, Jiashi Feng, Shen Sang, Linjie Luo

Capitalizing on the recent advancement of 3D-aware GAN models, we perform guided transfer learning on a pretrained 3D GAN generator to produce multi-view-consistent stylized renderings.

Transfer Learning

Boosting Verified Training for Robust Image Classifications via Abstraction

1 code implementation CVPR 2023 Zhaodi Zhang, Zhiyi Xue, Yang Chen, Si Liu, Yueling Zhang, Jing Liu, Min Zhang

Via abstraction, all perturbed images are mapped into intervals before feeding into neural networks for training.

Subjective and Objective Quality Assessment for in-the-Wild Computer Graphics Images

1 code implementation 14 Mar 2023 ZiCheng Zhang, Wei Sun, Yingjie Zhou, Jun Jia, Zhichao Zhang, Jing Liu, Xiongkuo Min, Guangtao Zhai

Computer graphics images (CGIs) are artificially generated by means of computer programs and are widely perceived under various scenarios, such as games, streaming media, etc.

Image Quality Assessment NR-IQA

MOSO: Decomposing MOtion, Scene and Object for Video Prediction

2 code implementations CVPR 2023 Mingzhen Sun, Weining Wang, Xinxin Zhu, Jing Liu

Experimental results demonstrate that our method achieves new state-of-the-art performance on five challenging benchmarks for video prediction and unconditional video generation: BAIR, RoboNet, KTH, KITTI and UCF101.

Object Unconditional Video Generation +2

SMoA: Sparse Mixture of Adapters to Mitigate Multiple Dataset Biases

no code implementations 28 Feb 2023 Yanchen Liu, Jing Yan, Yan Chen, Jing Liu, Hua Wu

Recent studies reveal that various biases exist in different NLP tasks, and over-reliance on biases results in models' poor generalization ability and low adversarial robustness.

Adversarial Robustness Natural Language Inference +1

Graph-based Knowledge Distillation: A survey and experimental evaluation

1 code implementation 27 Feb 2023 Jing Liu, Tongya Zheng, Guanzheng Zhang, Qinfen Hao

It then provides a comprehensive summary of three types of Graph-based Knowledge Distillation methods, namely Graph-based Knowledge Distillation for deep neural networks (DKD), Graph-based Knowledge Distillation for GNNs (GKD), and Self-Knowledge Distillation based Graph-based Knowledge Distillation (SKD).

Self-Knowledge Distillation

A novel efficient Multi-view traffic-related object detection framework

no code implementations 23 Feb 2023 Kun Yang, Jing Liu, Dingkang Yang, Hanqi Wang, Peng Sun, Yanni Zhang, Yan Liu, Liang Song

With the rapid development of intelligent transportation system applications, a tremendous amount of multi-view video data has emerged to enhance vehicle perception.

Model Selection object-detection +1

Tag-based annotation creates better avatars

no code implementations 14 Feb 2023 Minghao Liu, Zeyu Cheng, Shen Sang, Jing Liu, James Davis

Compared to direct annotation of labels, the proposed method produces higher annotator agreement, causes machine learning to generate more consistent predictions, and requires only a marginal cost to add new rendering systems.

TAG

Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models

1 code implementation 10 Feb 2023 Yang Liu, Dingkang Yang, Yan Wang, Jing Liu, Jun Liu, Azzedine Boukerche, Peng Sun, Liang Song

Video Anomaly Detection (VAD) serves as a pivotal technology in intelligent surveillance systems, enabling the temporal or spatial identification of anomalous events within videos.

Anomaly Detection Event Detection +1

A Survey on Efficient Training of Transformers

no code implementations 2 Feb 2023 Bohan Zhuang, Jing Liu, Zizheng Pan, Haoyu He, Yuetian Weng, Chunhua Shen

Recent advances in Transformers have come with a huge requirement on computing resources, highlighting the importance of developing efficient training techniques to make Transformer training faster, at lower cost, and to higher accuracy by the efficient use of computation and memory resources.

Discover governing differential equations from evolving systems

no code implementations 19 Jan 2023 Yuanyuan Li, Kai Wu, Jing Liu

Our proposal is competitive in identifying the change points and discovering governing differential equations in three hybrid systems and two switching linear systems.

BiViT: Extremely Compressed Binary Vision Transformers

no code implementations ICCV 2023 Yefei He, Zhenyu Lou, Luoming Zhang, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang

To solve this, we propose Softmax-aware Binarization, which dynamically adapts to the data distribution and reduces the error caused by binarization.

Binarization object-detection +1

LoTE-Animal: A Long Time-span Dataset for Endangered Animal Behavior Understanding

no code implementations ICCV 2023 Dan Liu, Jin Hou, Shaoli Huang, Jing Liu, Yuxin He, Bochuan Zheng, Jifeng Ning, Jingdong Zhang

To break the deadlock, we present LoTE-Animal, a large-scale endangered animal dataset collected over 12 years, to foster the application of deep learning in rare species conservation.

Action Recognition Domain Adaptation +5

Enhanced-rate Iterative Beamformers for Active IRS-assisted Wireless Communications

no code implementations 16 Dec 2022 Yeqing Lin, Feng Shu, Rongen Dong, Riqing Chen, Siling Feng, Weiping Shi, Jing Liu, Jiangzhou Wang

In this paper, to boost the achievable rate of the user in such a wireless network, three enhanced-rate iterative beamforming methods are proposed by designing the amplifying factors and the corresponding phases at the active IRS.

Three High-rate Beamforming Methods for Active IRS-aided Wireless Network

no code implementations 5 Dec 2022 Feng Shu, Jing Liu, Yeqing Lin, Yang Liu, Zhilin Chen, Xuehui Wang, Rongen Dong, Jiangzhou Wang

To fully exploit the amplifying gain achieved by the active IRS, two high-rate methods, maximum ratio reflecting (MRR) and selective ratio reflecting (SRR), are presented, which are motivated by maximum ratio combining and selective ratio combining.

Vocal Bursts Intensity Prediction

Privacy-Preserving Federated Deep Clustering based on GAN

no code implementations 30 Nov 2022 Jie Yan, Jing Liu, Ji Qi, Zhong-Yuan Zhang

Federated clustering (FC) is an essential extension of centralized clustering designed for the federated setting, wherein the challenge lies in constructing a global similarity measure without the need to share private data.

Clustering Deep Clustering +4

Higher-order Knowledge Transfer for Dynamic Community Detection with Great Changes

no code implementations 28 Nov 2022 Huixin Ma, Kai Wu, Handing Wang, Jing Liu

In this way, our proposal can better keep the advantages of previous community detection results and transfer them to the next task.

Community Detection Dynamic Community Detection +1

Dense Text Retrieval based on Pretrained Language Models: A Survey

2 code implementations 27 Nov 2022 Wayne Xin Zhao, Jing Liu, Ruiyang Ren, Ji-Rong Wen

With powerful PLMs, we can effectively learn the representations of queries and texts in the latent representation space, and further construct the semantic matching function between the dense vectors for relevance modeling.

Retrieval Text Retrieval

AgileAvatar: Stylized 3D Avatar Creation via Cascaded Domain Bridging

no code implementations 15 Nov 2022 Shen Sang, Tiancheng Zhi, Guoxian Song, Minghao Liu, Chunpong Lai, Jing Liu, Xiang Wen, James Davis, Linjie Luo

We propose a novel self-supervised learning framework to create high-quality stylized 3D avatars with a mix of continuous and discrete parameters.

Self-Supervised Learning

BiViT: Extremely Compressed Binary Vision Transformer

no code implementations 14 Nov 2022 Yefei He, Zhenyu Lou, Luoming Zhang, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang

To solve this, we propose Softmax-aware Binarization, which dynamically adapts to the data distribution and reduces the error caused by binarization.

Binarization object-detection +1

LGN-Net: Local-Global Normality Network for Video Anomaly Detection

1 code implementation 14 Nov 2022 Mengyang Zhao, Xinhua Zeng, Yang Liu, Jing Liu, Di Li, Xing Hu, Chengxin Pang

Existing unsupervised VAD methods tend to learn normality from training sets consisting of only normal videos and regard instances deviating from such normality as anomalies.

Anomaly Detection Video Anomaly Detection

Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

2 code implementations 7 Nov 2022 Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, Jingang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, Yoonsik Choe, Jinwoo Jeong, Sungjei Kim, Maciej Smyl, Tomasz Latkowski, Pawel Kubik, Michal Sokolski, Yujie Ma, Jiahao Chao, Zhou Zhou, Hongfan Gao, Zhengfeng Yang, Zhenbing Zeng, Zhengyang Zhuge, Chenghua Li, Dan Zhu, Mengdi Sun, Ran Duan, Yan Gao, Lingshun Kong, Long Sun, Xiang Li, Xingdong Zhang, Jiawei Zhang, Yaqi Wu, Jinshan Pan, Gaocheng Yu, Jin Zhang, Feng Zhang, Zhe Ma, Hongbin Wang, Hojin Cho, Steve Kim, Huaen Li, Yanbo Ma, Ziwei Luo, Youwei Li, Lei Yu, Zhihong Wen, Qi Wu, Haoqiang Fan, Shuaicheng Liu, Lize Zhang, Zhikai Zong, Jeremy Kwon, Junxi Zhang, Mengyuan Li, Nianxiang Fu, Guanchen Ding, Han Zhu, Zhenzhong Chen, Gen Li, Yuanfan Zhang, Lei Sun, Dafeng Zhang, Neo Yang, Fitz Liu, Jerry Zhao, Mustafa Ayazoglu, Bahri Batuhan Bilecen, Shota Hirose, Kasidis Arunruangsirilert, Luo Ao, Ho Chun Leung, Andrew Wei, Jie Liu, Qiang Liu, Dahai Yu, Ao Li, Lei Luo, Ce Zhu, Seongmin Hong, Dongwon Park, Joonhee Lee, Byeong Hyun Lee, Seunggyu Lee, Se Young Chun, Ruiyuan He, Xuhao Jiang, Haihang Ruan, Xinjian Zhang, Jing Liu, Garas Gendy, Nabil Sabor, Jingchao Hou, Guanghui He

While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints.

Image Super-Resolution

Federated clustering with GAN-based data synthesis

1 code implementation 29 Oct 2022 Jie Yan, Jing Liu, Ji Qi, Zhong-Yuan Zhang

Federated clustering (FC) is an extension of centralized clustering in federated settings.

Clustering Federated Learning +1

MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning

no code implementations 9 Oct 2022 Zijia Zhao, Longteng Guo, Xingjian He, Shuai Shao, Zehuan Yuan, Jing Liu

Our method performs joint masking on image-text input and integrates both implicit and explicit targets for the masked signals to recover.

Question Answering Representation Learning +5

EcoFormer: Energy-Saving Attention with Linear Complexity

1 code implementation 19 Sep 2022 Jing Liu, Zizheng Pan, Haoyu He, Jianfei Cai, Bohan Zhuang

To this end, we propose a new binarization paradigm customized to high-dimensional softmax attention via kernelized hashing, called EcoFormer, to map the original queries and keys into low-dimensional binary codes in Hamming space.

Binarization

FocusFormer: Focusing on What We Need via Architecture Sampler

no code implementations 23 Aug 2022 Jing Liu, Jianfei Cai, Bohan Zhuang

During architecture search, these methods focus on finding architectures on the Pareto frontier of performance and resource consumption, which forms a gap between training and deployment.

Neural Architecture Search

Provably Tightest Linear Approximation for Robustness Verification of Sigmoid-like Neural Networks

no code implementations 21 Aug 2022 Zhaodi Zhang, Yiting Wu, Si Liu, Jing Liu, Min Zhang

Considerable efforts have been devoted to finding the so-called tighter approximations to obtain more precise verification results.

An Interpretability Evaluation Benchmark for Pre-trained Language Models

no code implementations 28 Jul 2022 Yaozong Shen, Lijie Wang, Ying Chen, Xinyan Xiao, Jing Liu, Hua Wu

To fill in the gap, we propose a novel evaluation benchmark providing both English and Chinese annotated data.

HIRE: Distilling High-order Relational Knowledge From Heterogeneous Graph Neural Networks

no code implementations 25 Jul 2022 Jing Liu, Tongya Zheng, Qinfen Hao

To the best of our knowledge, we are the first to propose a HIgh-order RElational (HIRE) knowledge distillation framework on heterogeneous graphs, which can significantly boost the prediction performance regardless of model architectures of HGNNs.

Knowledge Distillation Vocal Bursts Intensity Prediction

Dynamic Local Aggregation Network with Adaptive Clusterer for Anomaly Detection

1 code implementation 22 Jul 2022 Zhiwei Yang, Peng Wu, Jing Liu, Xiaotao Liu

Existing methods for anomaly detection based on memory-augmented autoencoder (AE) have the following drawbacks: (1) Establishing a memory bank requires additional memory space.

Anomaly Detection

Reducing US Biofuels Requirements Mitigates Short-term Impacts of Global Population and Income Growth on Agricultural Environmental Outcomes

no code implementations 28 Jun 2022 David R. Johnson, Nathan B. Geldner, Jing Liu, Uris Lantz Baldos, Thomas Hertel

Biobased energy, particularly corn starch-based ethanol and other liquid renewable fuels, is a major element of federal and state energy policies in the United States.

Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation

1 code implementation 25 May 2022 Yanrui Du, Jing Yan, Yan Chen, Jing Liu, Sendong Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang, Bing Qin

In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution of training data.

Natural Language Inference Sentiment Analysis

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations 11 May 2022 Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that achieved improvement of efficiency measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption while at least maintaining the PSNR of 29.00 dB on the DIV2K validation set.

Image Super-Resolution

MemSeg: A semi-supervised method for image surface defect detection using differences and commonalities

4 code implementations 2 May 2022 Minghui Yang, Peng Wu, Jing Liu, Hui Feng

By comparing the similarities and differences between input samples and memory samples in the memory pool, MemSeg gives effective guesses about abnormal regions; in the inference phase, MemSeg directly determines the abnormal regions of the input image in an end-to-end manner.

Anomaly Detection Defect Detection +1

A Thorough Examination on Zero-shot Dense Retrieval

no code implementations 27 Apr 2022 Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qifei Wu, Yuchen Ding, Hua Wu, Haifeng Wang, Ji-Rong Wen

Recent years have witnessed significant advances in dense retrieval (DR) based on powerful pre-trained language models (PLMs).

Retrieval

A Multi-Transformation Evolutionary Framework for Influence Maximization in Social Networks

1 code implementation 7 Apr 2022 Chao Wang, Jiaxuan Zhao, Lingling Li, Licheng Jiao, Jing Liu, Kai Wu

Influence maximization is a crucial issue for mining the deep information of social networks, which aims to select a seed set from the network to maximize the number of influenced nodes.

Dynamic Focus-aware Positional Queries for Semantic Segmentation

2 code implementations CVPR 2023 Haoyu He, Jianfei Cai, Zizheng Pan, Jing Liu, Jing Zhang, DaCheng Tao, Bohan Zhuang

In this paper, we propose a simple yet effective query design for semantic segmentation termed Dynamic Focus-aware Positional Queries (DFPQ), which dynamically generates positional queries conditioned on the cross-attention scores from the preceding decoder block and the positional encodings for the corresponding image features, simultaneously.

Semantic Segmentation

Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding

no code implementations 1 Apr 2022 Xuandi Fu, Feng-Ju Chang, Martin Radfar, Kai Wei, Jing Liu, Grant P. Strimel, Kanthashree Mysore Sathyendra

In addition, the NLU model in the two-stage system is not streamable, as it must wait for the audio segments to complete processing, which ultimately impacts the latency of the SLU system.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Evolutionary Multitasking AUC Optimization

1 code implementation 4 Jan 2022 Chao Wang, Kai Wu, Jing Liu

Inspired by the characteristic of pairwise learning, the cheap AUC optimization task with a small-scale dataset sampled from the large-scale dataset is constructed to promote the AUC accuracy of the original, large-scale, and expensive AUC optimization task.

Binary Classification

Network Collaborator: Knowledge Transfer Between Network Reconstruction and Community Detection

1 code implementation 4 Jan 2022 Kai Wu, Chao Wang, Junyuan Chen, Jing Liu

Community detection (CD) from dynamics and network reconstruction (NR) from dynamics are natural synergistic tasks that motivate the proposed evolutionary multitasking NR and CD framework, called network collaborator (NC).

Community Detection Transfer Learning

DuQM: A Chinese Dataset of Linguistically Perturbed Natural Questions for Evaluating the Robustness of Question Matching Models

1 code implementation 16 Dec 2021 Hongyu Zhu, Yan Chen, Jing Yan, Jing Liu, Yu Hong, Ying Chen, Hua Wu, Haifeng Wang

For this purpose, we create a Chinese dataset namely DuQM which contains natural questions with linguistic perturbations to evaluate the robustness of question matching models.

Natural Questions

Sharpness-aware Quantization for Deep Neural Networks

3 code implementations 24 Nov 2021 Jing Liu, Jianfei Cai, Bohan Zhuang

However, the abrupt changes in quantized weights during training often lead to severe loss fluctuations and result in a sharp loss landscape, making the gradients unstable and thus degrading the performance.

Image Classification Model Compression +1

Pruning Self-attentions into Convolutional Layers in Single Path

3 code implementations 23 Nov 2021 Haoyu He, Jianfei Cai, Jing Liu, Zizheng Pan, Jing Zhang, DaCheng Tao, Bohan Zhuang

Relying on the single-path space, we introduce learnable binary gates to encode the operation choices in MSA layers.

Inductive Bias Neural Architecture Search

Mesa: A Memory-saving Training Framework for Transformers

3 code implementations 22 Nov 2021 Zizheng Pan, Peng Chen, Haoyu He, Jing Liu, Jianfei Cai, Bohan Zhuang

While Transformers have delivered significant performance improvements, training such networks is extremely memory intensive owing to storing all intermediate activations that are needed for gradient computation during backpropagation, especially for long sequences.

Quantization

RVFR: Robust Vertical Federated Learning via Feature Subspace Recovery

no code implementations 29 Sep 2021 Jing Liu, Chulin Xie, Krishnaram Kenthapadi, Oluwasanmi O Koyejo, Bo Li

Vertical Federated Learning (VFL) is a distributed learning paradigm that allows multiple agents to jointly train a global model when each agent holds a different subset of features for the same sample(s).

Vertical Federated Learning

Exploiting Spatial-Temporal Semantic Consistency for Video Scene Parsing

no code implementations 6 Sep 2021 Xingjian He, Weining Wang, Zhiyong Xu, Hao Wang, Jie Jiang, Jing Liu

Compared with image scene parsing, video scene parsing introduces temporal information, which can effectively improve the consistency and accuracy of prediction.

Scene Parsing

Resisting Out-of-Distribution Data Problem in Perturbation of XAI

no code implementations 27 Jul 2021 Luyu Qiu, Yi Yang, Caleb Chen Cao, Jing Liu, Yueyuan Zheng, Hilary Hei Ting Ngai, Janet Hsiao, Lei Chen

Besides, our solution also resolves a fundamental problem with the faithfulness indicator, a commonly used evaluation metric of XAI algorithms that appears to be sensitive to the OoD issue.

Explainable artificial intelligence Explainable Artificial Intelligence (XAI)

HANet: Hierarchical Alignment Networks for Video-Text Retrieval

1 code implementation 26 Jul 2021 Peng Wu, Xiangteng He, Mingqian Tang, Yiliang Lv, Jing Liu

Based on these, we naturally construct hierarchical representations in the individual-local-global manner, where the individual level focuses on the alignment between frame and word, local level focuses on the alignment between video clip and textual context, and global level focuses on the alignment between the whole video and text.

Retrieval Text Matching +3

AgileGAN: stylizing portraits by inversion-consistent transfer learning

1 code implementation ACM Transactions on Graphics 2021 Guoxian Song, Linjie Luo, Jing Liu, Wan-Chun Ma, Chun-Pong Lai, Chuanxia Zheng, Tat-Jen Cham

While substantial progress has been made in automated stylization, generating high-quality stylistic portraits is still a challenge, and even the recent popular Toonify suffers from several artifacts when used on real input images.

Attribute motion retargeting +1

OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation

2 code implementations 1 Jul 2021 Jing Liu, Xinxin Zhu, Fei Liu, Longteng Guo, Zijia Zhao, Mingzhen Sun, Weining Wang, Hanqing Lu, Shiyu Zhou, Jiajun Zhang, Jinqiao Wang

In this paper, we propose an Omni-perception Pre-Trainer (OPT) for cross-modal understanding and generation, by jointly modeling visual, text and audio resources.

Audio to Text Retrieval Cross-Modal Retrieval +3

Tensor networks for unsupervised machine learning

1 code implementation 24 Jun 2021 Jing Liu, Sujie Li, Jiang Zhang, Pan Zhang

Despite the great potential, however, existing tensor network models for unsupervised machine learning only work as a proof of principle, as their performance is much worse than the standard models such as restricted Boltzmann machines and neural networks.

BIG-bench Machine Learning Tensor Networks

Exploiting Large-scale Teacher-Student Training for On-device Acoustic Models

no code implementations 11 Jun 2021 Jing Liu, Rupak Vignesh Swaminathan, Sree Hari Krishnan Parthasarathi, Chunchuan Lyu, Athanasios Mouchtaris, Siegfried Kunzmann

We present results from Alexa speech teams on semi-supervised learning (SSL) of acoustic models (AM) with experiments spanning over 3000 hours of GPU time, making our study one of the largest of its kind.

Measuring Conversational Uptake: A Case Study on Student-Teacher Interactions

1 code implementation ACL 2021 Dorottya Demszky, Jing Liu, Zid Mancenido, Julie Cohen, Heather Hill, Dan Jurafsky, Tatsunori Hashimoto

In conversation, uptake happens when a speaker builds on the contribution of their interlocutor by, for example, acknowledging, repeating or reformulating what they have said.

Math Question Answering

Large-Scale Data-Driven Airline Market Influence Maximization

no code implementations 31 May 2021 Duanshun Li, Jing Liu, Jinsung Jeon, Seoyoung Hong, Thai Le, Dongwon Lee, Noseong Park

On top of the prediction models, we define a budget-constrained flight frequency optimization problem to maximize the market influence over 2,262 routes.

Boosting the Performance of Video Compression Artifact Reduction with Reference Frame Proposals and Frequency Domain Information

no code implementations 31 May 2021 Yi Xu, Minyi Zhao, Jing Liu, Xinjian Zhang, Longwen Gao, Shuigeng Zhou, Huyang Sun

Many deep learning based video compression artifact removal algorithms have been proposed to recover high-quality videos from low-quality compressed videos.

Video Compression

Less is More: Pay Less Attention in Vision Transformers

2 code implementations 29 May 2021 Zizheng Pan, Bohan Zhuang, Haoyu He, Jing Liu, Jianfei Cai

Transformers have become one of the dominant architectures in deep learning, particularly as a powerful alternative to convolutional neural networks (CNNs) in computer vision.

Image Classification Instance Segmentation +3

AAformer: Auto-Aligned Transformer for Person Re-Identification

no code implementations 2 Apr 2021 Kuan Zhu, Haiyun Guo, Shiliang Zhang, YaoWei Wang, Gaopan Huang, Honglin Qiao, Jing Liu, Jinqiao Wang, Ming Tang

In this paper, we introduce an alignment scheme in Transformer architecture for the first time and propose the Auto-Aligned Transformer (AAformer) to automatically locate both the human parts and non-human ones at patch-level.

Human Parsing Image Classification +3

Scalable Vision Transformers with Hierarchical Pooling

2 code implementations ICCV 2021 Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai

However, the routine of the current ViT model is to maintain a full-length patch sequence during inference, which is redundant and lacks hierarchical representation.

Image Classification

Temporal Memory Attention for Video Semantic Segmentation

1 code implementation 17 Feb 2021 Hao Wang, Weining Wang, Jing Liu

Video semantic segmentation requires utilizing the complex temporal relations between frames of the video sequence.

Segmentation Semantic Segmentation +1

CPTR: Full Transformer Network for Image Captioning

no code implementations 26 Jan 2021 Wei Liu, Sihan Chen, Longteng Guo, Xinxin Zhu, Jing Liu

Besides, we provide detailed visualizations of the self-attention between patches in the encoder and the "words-to-patches" attention in the decoder thanks to the full Transformer architecture.

Image Captioning

Global-Local Propagation Network for RGB-D Semantic Segmentation

no code implementations 26 Jan 2021 Sihan Chen, Xinxin Zhu, Wei Liu, Xingjian He, Jing Liu

Depth information matters in RGB-D semantic segmentation task for providing additional geometric information to color images.

Scene Segmentation Segmentation

Fast Sequence Generation with Multi-Agent Reinforcement Learning

no code implementations 24 Jan 2021 Longteng Guo, Jing Liu, Xinxin Zhu, Hanqing Lu

These models are autoregressive in that they generate each word by conditioning on previously generated words, which leads to heavy latency during inference.

Image Captioning Machine Translation +5

Single-path Bit Sharing for Automatic Loss-aware Model Compression

no code implementations 13 Jan 2021 Jing Liu, Bohan Zhuang, Peng Chen, Chunhua Shen, Jianfei Cai, Mingkui Tan

By jointly training the binary gates in conjunction with network parameters, the compression configurations of each layer can be automatically determined.

Model Compression Network Pruning +1

HAIR: Hierarchical Visual-Semantic Relational Reasoning for Video Question Answering

no code implementations ICCV 2021 Fei Liu, Jing Liu, Weining Wang, Hanqing Lu

Specifically, we present a novel graph memory mechanism to perform relational reasoning, and further develop two types of graph memory: a) visual graph memory that leverages visual information of video for relational reasoning; b) semantic graph memory that is specifically designed to explicitly leverage semantic knowledge contained in the classes and attributes of video objects, and perform relational reasoning in the semantic space.

Question Answering Relational Reasoning +1

Magnetic field and gravitational waves from the first-order Phase Transition

no code implementations 31 Dec 2020 Yuefeng Di, Jialong Wang, Ruiyu Zhou, Ligong Bian, Rong-Gen Cai, Jing Liu

We perform the three dimensional lattice simulation of the magnetic field and gravitational wave productions from bubble collisions during the first-order electroweak phase transition.

Cosmology and Nongalactic Astrophysics High Energy Physics - Lattice High Energy Physics - Phenomenology

AutoCaption: Image Captioning with Neural Architecture Search

no code implementations 16 Dec 2020 Xinxin Zhu, Weining Wang, Longteng Guo, Jing Liu

The whole process involves a visual understanding module and a language generation module, which brings more challenges to the design of deep neural networks than other tasks.

Image Captioning Neural Architecture Search +1

DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection

1 code implementation 9 Dec 2020 Jing Liu, Jiaxiang Wang, Weikang Wang, Yuting Su

In this paper, we investigate the complementary roles of spatial and temporal information and propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of spatiotemporal information.

object-detection Optical Flow Estimation +3

Conditional Automated Channel Pruning for Deep Neural Networks

no code implementations 21 Sep 2020 Yixin Liu, Yong Guo, Zichang Liu, Haohua Liu, Jingjie Zhang, Zejun Chen, Jing Liu, Jian Chen

To address this issue, given a target compression rate for the whole model, one can search for the optimal compression rate for each layer.

Model Compression

Robust Mean Estimation in High Dimensions via $\ell_0$ Minimization

no code implementations 21 Aug 2020 Jing Liu, Aditya Deshmukh, Venugopal V. Veeravalli

We study the robust mean estimation problem in high dimensions, where an $\alpha < 0.5$ fraction of the data points can be arbitrarily corrupted.

Compressive Sensing Vocal Bursts Intensity Prediction

Scene Segmentation with Dual Relation-aware Attention Network

1 code implementation TNNLS 2020 Jun Fu, Jing Liu, Jie Jiang, Yong Li, Yongjun Bao, Hanqing Lu

We conduct extensive experiments to validate the effectiveness of our network and achieve new state-of-the-art segmentation performance on four challenging scene segmentation data sets, i.e., Cityscapes, ADE20K, PASCAL Context, and COCO Stuff data sets.

Relation Scene Segmentation +1

AQD: Towards Accurate Fully-Quantized Object Detection

1 code implementation CVPR 2021 Peng Chen, Jing Liu, Bohan Zhuang, Mingkui Tan, Chunhua Shen

Network quantization allows inference to be conducted using low-precision arithmetic for improved inference efficiency of deep neural networks on edge devices.

Image Classification Object +3

Non-Autoregressive Image Captioning with Counterfactuals-Critical Multi-Agent Learning

no code implementations 10 May 2020 Longteng Guo, Jing Liu, Xinxin Zhu, Xingjian He, Jie Jiang, Hanqing Lu

In this paper, we propose a Non-Autoregressive Image Captioning (NAIC) model with a novel training paradigm: Counterfactuals-critical Multi-Agent Learning (CMAL).

Image Captioning Machine Translation +3

Gumbel-softmax-based Optimization: A Simple General Framework for Optimization Problems on Graphs

no code implementations 14 Apr 2020 Yaoxin Li, Jing Liu, Guozheng Lin, Yueyuan Hou, Muyun Mou, Jiang Zhang

In computer science, there exist a large number of optimization problems defined on graphs, that is, to find the best node state configuration or a network structure such that the designed objective function is optimized under some constraints.

Combinatorial Optimization

Micro-supervised Disturbance Learning: A Perspective of Representation Probability Distribution

no code implementations 13 Mar 2020 Jielei Chu, Jing Liu, Hongjun Wang, Meng Hua, Zhiguo Gong, Tianrui Li

To explore the representation learning capability under the continuous stimulation of the SPI, we present a deep Micro-supervised Disturbance Learning (Micro-DL) framework based on the Micro-DGRBM and Micro-DRBM models and compare it with a similar deep structure that has no external stimulation.

Representation Learning

Generative Low-bitwidth Data Free Quantization

3 code implementations ECCV 2020 Shoukai Xu, Haokun Li, Bohan Zhuang, Jing Liu, JieZhang Cao, Chuangrun Liang, Mingkui Tan

More critically, our method achieves much higher accuracy on 4-bit quantization than the existing data free quantization method.

Data Free Quantization

Learning to Generate Time Series Conditioned Graphs with Generative Adversarial Nets

no code implementations 3 Mar 2020 Shanchao Yang, Jing Liu, Kai Wu, Mingming Li

Differently, in this paper, we are interested in a novel problem named Time Series Conditioned Graph Generation: given an input multivariate time series, we aim to infer a target relation graph modeling the underlying interrelationships between time series with each node corresponding to each time series.

Graph Generation Time Series +1

Discrimination-aware Network Pruning for Deep Model Compression

1 code implementation 4 Jan 2020 Jing Liu, Bohan Zhuang, Zhuangwei Zhuang, Yong Guo, Junzhou Huang, Jinhui Zhu, Mingkui Tan

In this paper, we propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power.

Face Recognition Image Classification +2

Semantic-Aware Label Placement for Augmented Reality in Street View

no code implementations 15 Dec 2019 Jianqing Jia, Semir Elezovikj, Heng Fan, Shuojin Yang, Jing Liu, Wei Guo, Chiu C. Tan, Haibin Ling

Our solution encodes the constraints for placing labels in an optimization problem to obtain the final label layout, and the labels will be placed in appropriate positions to reduce the chances of overlaying important real-world objects in street view AR scenarios.

CoKE: Contextualized Knowledge Graph Embedding

3 code implementations 6 Nov 2019 Quan Wang, Pingping Huang, Haifeng Wang, Songtai Dai, Wenbin Jiang, Jing Liu, Yajuan Lyu, Yong Zhu, Hua Wu

This work presents Contextualized Knowledge Graph Embedding (CoKE), a novel paradigm that takes into account such contextual nature, and learns dynamic, flexible, and fully contextualized entity and relation embeddings.

Knowledge Graph Embedding Link Prediction +1

Adaptive Context Network for Scene Parsing

no code implementations ICCV 2019 Jun Fu, Jing Liu, Yuhang Wang, Yong Li, Yongjun Bao, Jinhui Tang, Hanqing Lu

Recent works attempt to improve scene parsing performance by exploring different levels of contexts, and typically train a well-designed convolutional network to exploit useful contexts across all pixels equally.

Scene Parsing Semantic Segmentation

D-NET: A Pre-Training and Fine-Tuning Framework for Improving the Generalization of Machine Reading Comprehension

1 code implementation WS 2019 Hongyu Li, Xiyuan Zhang, Yibing Liu, Yiming Zhang, Quan Wang, Xiangyang Zhou, Jing Liu, Hua Wu, Haifeng Wang

In this paper, we introduce a simple system Baidu submitted for MRQA (Machine Reading for Question Answering) 2019 Shared Task that focused on generalization of machine reading comprehension (MRC) models.

Machine Reading Comprehension Multi-Task Learning +1

Flash X-ray diffraction imaging in 3D: a proposed analysis pipeline

no code implementations 30 Oct 2019 Jing Liu, Stefan Engblom, Carl Nettelblad

Modern Flash X-ray diffraction Imaging (FXI) acquires diffraction signals from single biomolecules at a high repetition rate from X-ray Free Electron Lasers (XFELs), easily obtaining millions of 2D diffraction patterns from a single experiment.

Vatex Video Captioning Challenge 2020: Multi-View Features and Hybrid Reward Strategies for Video Captioning

no code implementations 17 Oct 2019 Xinxin Zhu, Longteng Guo, Peng Yao, Shichen Lu, Wei Liu, Jing Liu

This report describes our solution for the VATEX Captioning Challenge 2020, which requires generating descriptions for the videos in both English and Chinese languages.

Video Captioning

Gumbel-softmax Optimization: A Simple General Framework for Combinatorial Optimization Problems on Graphs

no code implementations 16 Sep 2019 Jing Liu, Fei Gao, Jiang Zhang

Many problems in real life can be converted to combinatorial optimization problems (COPs) on graphs, that is, to find the best node state configuration or a network structure such that the designed objective function is optimized under some constraints.

Combinatorial Optimization

Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations

no code implementations 10 Aug 2019 Bohan Zhuang, Jing Liu, Mingkui Tan, Lingqiao Liu, Ian Reid, Chunhua Shen

Furthermore, we propose a second progressive quantization scheme which gradually decreases the bit-width from high-precision to low-precision during training.

Knowledge Distillation Quantization

Aligning Linguistic Words and Visual Semantic Units for Image Captioning

1 code implementation 6 Aug 2019 Longteng Guo, Jing Liu, Jinhui Tang, Jiangwei Li, Wei Luo, Hanqing Lu

Image captioning attempts to generate a sentence composed of several linguistic words, which are used to describe objects, attributes, and interactions in an image, denoted as visual semantic units in this paper.

Attribute Image Captioning +2

Quantum Fisher information matrix and multiparameter estimation

no code implementations 18 Jul 2019 Jing Liu, Haidong Yuan, Xiao-Ming Lu, Xiaoguang Wang

The quantum Fisher information matrix (QFIM) is a core concept in theoretical quantum metrology due to the significant importance of the quantum Cramér-Rao bound in quantum parameter estimation.

Quantum Physics

Multi-local Collaborative AutoEncoder

no code implementations 12 Jun 2019 Jielei Chu, Hongjun Wang, Jing Liu, Zhiguo Gong, Tianrui Li

In mcrRBM and mcrGRBM models, the structure and multi-local collaborative relationships of unlabeled data are integrated into their encoding procedure.

Clustering Representation Learning

A General Deep Learning Framework for Network Reconstruction and Dynamics Learning

1 code implementation 30 Dec 2018 Zhang Zhang, Yi Zhao, Jing Liu, Shuo Wang, Ruyi Tao, Ruyue Xin, Jiang Zhang

We exhibit the universality of our framework on different kinds of time-series data: with the same structure, our model can be trained to accurately recover the network structure and predict future states on continuous, discrete, and binary dynamics, and outperforms competing network reconstruction methods.

Time Series Time Series Analysis

Unsupervised Feature Learning Architecture with Multi-clustering Integration RBM

no code implementations 5 Dec 2018 Jielei Chu, Hongjun Wang, Jing Liu, Zhiguo Gong, Tianrui Li

In this paper, we present a novel unsupervised feature learning architecture, which consists of a multi-clustering integration module and a variant of RBM termed multi-clustering integration RBM (MIRBM).

Clustering

Supervised Classification Methods for Flash X-ray single particle diffraction Imaging

no code implementations 25 Oct 2018 Jing Liu, Gijs van der Schot, Stefan Engblom

It is also straightforward to parallelize them so as to fully match the XFEL repetition rate, thereby enabling on-site processing.

General Classification

Answer-focused and Position-aware Neural Question Generation

no code implementations EMNLP 2018 Xingwu Sun, Jing Liu, Yajuan Lyu, Wei He, Yanjun Ma, Shi Wang

(2) The model copies the context words that are far from and irrelevant to the answer, instead of the words that are close and relevant to the answer.

Machine Reading Comprehension Position +3

Aggregated Semantic Matching for Short Text Entity Linking

no code implementations CONLL 2018 Feng Nie, Shuyan Zhou, Jing Liu, Jinpeng Wang, Chin-Yew Lin, Rong Pan

The task of entity linking aims to identify concepts mentioned in a text fragment and link them to a reference knowledge base.

Card Games Entity Linking +2

Dual Attention Network for Scene Segmentation

12 code implementations CVPR 2019 Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang, Hanqing Lu

Specifically, we append two types of attention modules on top of traditional dilated FCN, which model the semantic interdependencies in spatial and channel dimensions respectively.

Position Segmentation +1

Neural Math Word Problem Solver with Reinforcement Learning

no code implementations COLING 2018 Danqing Huang, Jing Liu, Chin-Yew Lin, Jian Yin

Experimental results show that (1) The copy and alignment mechanism is effective to address the two issues; (2) Reinforcement learning leads to better performance than maximum likelihood on this task; (3) Our neural model is complementary to the feature-based model and their combination significantly outperforms the state-of-the-art results.

Feature Engineering Math +3

Adaptations of ROUGE and BLEU to Better Evaluate Machine Reading Comprehension Task

no code implementations WS 2018 An Yang, Kai Liu, Jing Liu, Yajuan Lyu, Sujian Li

Current evaluation metrics for question-answering-based machine reading comprehension (MRC) systems, such as ROUGE and BLEU, generally focus on the lexical overlap between the candidate and reference answers.

Machine Reading Comprehension Question Answering

Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification

no code implementations ACL 2018 Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li, Haifeng Wang

Machine reading comprehension (MRC) on real web data usually requires the machine to answer a question by analyzing multiple passages retrieved by a search engine.

Machine Reading Comprehension Question Answering

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

3 code implementations WS 2018 Wei He, Kai Liu, Jing Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yu-An Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, Haifeng Wang

Experiments show that human performance is well above current state-of-the-art baseline systems, leaving plenty of room for the community to make improvements.

Machine Reading Comprehension

A Statistical Framework for Product Description Generation

no code implementations IJCNLP 2017 Jinpeng Wang, Yutai Hou, Jing Liu, Yunbo Cao, Chin-Yew Lin

We present in this paper a statistical framework that generates accurate and fluent product descriptions from product attributes.

Attribute Data-to-Text Generation

Stacked Deconvolutional Network for Semantic Segmentation

no code implementations 16 Aug 2017 Jun Fu, Jing Liu, Yuhang Wang, Hanqing Lu

In SDN, multiple shallow deconvolutional networks, which are called SDN units, are stacked one by one to integrate contextual information and guarantee the fine recovery of localization information.

Segmentation Semantic Segmentation

Long-term Blood Pressure Prediction with Deep Recurrent Neural Networks

2 code implementations 12 May 2017 Peng Su, Xiao-Rong Ding, Yuan-Ting Zhang, Jing Liu, Fen Miao, Ni Zhao

Existing methods for arterial blood pressure (BP) estimation directly map the input physiological signals to output BP values without explicitly modeling the underlying temporal dependencies in BP dynamics.

Blood pressure estimation Electrocardiography (ECG) +1

Assessing Uncertainties in X-ray Single-particle Three-dimensional reconstructions

no code implementations 2 Jan 2017 Stefan Engblom, Carl Nettelblad, Jing Liu

These two-dimensional diffraction patterns can be practically reconstructed and retrieved down to a resolution of a few Å.
