Speaker Clustering in Textual Dialogue with Pairwise Utterance Relation and Cross-corpus Dialogue Act Supervision

no code implementations COLING 2022 Zhihua Su, Qiang Zhou

We propose a speaker clustering model for textual dialogues, which groups the utterances of a multi-party dialogue without speaker annotations, so that the actual speakers are identical inside each cluster.

Clustering Cross-corpus +5

Research on the correlation between eye movement feature and thematic structure label (眼动记录与主旨结构标注的关联性分析研究)

no code implementations CCL 2020 Haocong Shan, Qiang Zhou

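Given a Chinese sentence group that contains a gist-summarizing sentence, the annotation of the group's internal structure is the result of linguistic analysis, while the eye-movement trajectory recorded during reading reflects human psychological cognition; the main work of this paper is to fuse the information from the two and analyze their intrinsic correlation. Using a classification model based on a radial basis function support vector machine and recursive feature elimination, the paper predicts from the eye-movement measures of each punctuation-delimited clause segment whether that segment is key information carrying the gist, reaching an accuracy of 0.76. By analyzing how eye-movement data are distributed over the key segments, it also extracts the eye-movement measures that best discriminate the gist-summarizing information of a sentence group.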

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach

1 code implementation 28 Jan 2024 Shaofeng Zhang, Jinfa Huang, Qiang Zhou, Zhibin Wang, Fan Wang, Jiebo Luo, Junchi Yan

At inference, we generate images with arbitrary expansion multiples by inputting an anchor image and its corresponding positional embeddings.

Image Outpainting

DMT: Comprehensive Distillation with Multiple Self-supervised Teachers

no code implementations 19 Dec 2023 Yuang Liu, Jing Wang, Qiang Zhou, Fan Wang, Jun Wang, Wei Zhang

Numerous self-supervised learning paradigms, such as contrastive learning and masked image modeling, have been proposed to acquire powerful and general representations from unlabeled data.

Contrastive Learning Model Compression +1

Language-guided Few-shot Semantic Segmentation

no code implementations 23 Nov 2023 Jing Wang, Yuang Liu, Qiang Zhou, Fan Wang

Few-shot learning is a promising way for reducing the label cost in new categories adaptation with the guidance of a small, well labeled support set.

Few-Shot Semantic Segmentation Segmentation +1

InfMLLM: A Unified Framework for Visual-Language Tasks

2 code implementations 12 Nov 2023 Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi

Our experiments demonstrate that preserving the positional information of visual embeddings through the pool-adapter is particularly beneficial for tasks like visual grounding.

Image Captioning Instruction Following +3

PAD: A Dataset and Benchmark for Pose-agnostic Anomaly Detection

1 code implementation NeurIPS 2023 Qiang Zhou, Weize Li, Lihan Jiang, Guoliang Wang, Guyue Zhou, Shanghang Zhang, Hao Zhao

Furthermore, we provide an open-source benchmark library, including dataset and baseline methods that cover 8 anomaly detection paradigms, to facilitate future research and application in this domain.

4k Anomaly Detection

A ModelOps-based Framework for Intelligent Medical Knowledge Extraction

no code implementations 4 Oct 2023 Hongxin Ding, Peinie Zou, Zhiyuan Wang, Junfeng Zhao, Yasha Wang, Qiang Zhou

Extracting medical knowledge from healthcare texts enhances downstream tasks like medical knowledge graph construction and clinical decision-making.

Decision Making graph construction +2

ES-MVSNet: Efficient Framework for End-to-end Self-supervised Multi-View Stereo

no code implementations 4 Aug 2023 Qiang Zhou, Chaohui Yu, Jingliang Li, Yuang Liu, Jing Wang, Zhibin Wang

to provide additional consistency constraints, which grows GPU memory consumption and complicates the model's structure and training pipeline.

Optical Flow Estimation Semantic Segmentation

Dynamic Token-Pass Transformers for Semantic Segmentation

no code implementations 3 Aug 2023 Yuang Liu, Qiang Zhou, Jing Wang, Fan Wang, Jun Wang, Wei Zhang

Vision transformers (ViT) usually extract features via forwarding all the tokens in the self-attention layers from top to toe.

Segmentation Semantic Segmentation

RegionBLIP: A Unified Multi-modal Pre-training Framework for Holistic and Regional Comprehension

1 code implementation 3 Aug 2023 Qiang Zhou, Chaohui Yu, Shaofeng Zhang, Sitong Wu, Zhibing Wang, Fan Wang

To this end, we propose to extract features corresponding to regional objects as soft prompts for LLM, which provides a straightforward and scalable approach and eliminates the need for LLM fine-tuning.

Image Comprehension

Improved Neural Radiance Fields Using Pseudo-depth and Fusion

no code implementations 27 Jul 2023 Jingliang Li, Qiang Zhou, Chaohui Yu, Zhengda Lu, Jun Xiao, Zhibin Wang, Fan Wang

To make the constructed volumes as close as possible to the surfaces of objects in the scene and the rendered depth more accurate, we propose to perform depth prediction and radiance field reconstruction simultaneously.

Depth Estimation Depth Prediction +1

Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation

no code implementations 26 Jul 2023 Chaohui Yu, Qiang Zhou, Jingliang Li, Zhe Zhang, Zhibin Wang, Fan Wang

To better utilize the sparse 3D points, we propose an efficient point cloud guidance loss to adaptively drive the NeRF's geometry to align with the shape of the sparse 3D points.

Text to 3D

DPF: Learning Dense Prediction Fields with Weak Supervision

1 code implementation CVPR 2023 Xiaoxue Chen, Yuhang Zheng, Yupeng Zheng, Qiang Zhou, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

We showcase the effectiveness of DPFs using two substantially different tasks: high-level semantic parsing and low-level intrinsic image decomposition.

Intrinsic Image Decomposition Scene Understanding +1

D2Q-DETR: Decoupling and Dynamic Queries for Oriented Object Detection with Transformers

no code implementations 1 Mar 2023 Qiang Zhou, Chaohui Yu, Zhibin Wang, Fan Wang

In this paper, we propose an end-to-end framework for oriented object detection, which simplifies the model pipeline and obtains superior performance.

Object object-detection +3

LMSeg: Language-guided Multi-dataset Segmentation

no code implementations 27 Feb 2023 Qiang Zhou, Yuang Liu, Chaohui Yu, Jingliang Li, Zhibin Wang, Fan Wang

Instead of relabeling each dataset with the unified taxonomy, a category-guided decoding module is designed to dynamically guide predictions to each dataset's taxonomy.

Image Augmentation Panoptic Segmentation +1

MimCo: Masked Image Modeling Pre-training with Contrastive Teacher

no code implementations 7 Sep 2022 Qiang Zhou, Chaohui Yu, Hao Luo, Zhibin Wang, Hao Li

Specifically, MimCo takes a pre-trained contrastive learning model as the teacher model and is pre-trained with two types of learning targets: patch-level and image-level reconstruction losses.

Contrastive Learning Self-Supervised Learning

Point RCNN: An Angle-Free Framework for Rotated Object Detection

no code implementations 28 May 2022 Qiang Zhou, Chaohui Yu, Zhibin Wang, Hao Li

To tackle this problem, we propose a purely angle-free framework for rotated object detection, called Point RCNN, which mainly consists of PointRPN and PointReg.

Object object-detection +1

Fix Bugs with Transformer through a Neural-Symbolic Edit Grammar

no code implementations 13 Apr 2022 Yaojie Hu, Xingjian Shi, Qiang Zhou, Lee Pike

We introduce NSEdit (neural-symbolic edit), a novel Transformer-based code repair method.

Code Repair

GAMMA Challenge: Glaucoma grAding from Multi-Modality imAges

no code implementations 14 Feb 2022 Junde Wu, Huihui Fang, Fei Li, Huazhu Fu, Fengbin Lin, Jiongcheng Li, Lexing Huang, Qinji Yu, Sifan Song, Xinxing Xu, Yanyu Xu, Wensai Wang, Lingxiao Wang, Shuai Lu, Huiqi Li, Shihua Huang, Zhichao Lu, Chubin Ou, Xifei Wei, Bingyuan Liu, Riadh Kobbi, Xiaoying Tang, Li Lin, Qiang Zhou, Qiang Hu, Hrvoje Bogunovic, José Ignacio Orlando, Xiulan Zhang, Yanwu Xu

However, although numerous algorithms are proposed based on fundus images or OCT volumes in computer-aided diagnosis, there are still few methods leveraging both of the modalities for the glaucoma assessment.

Delving Deep into the Generalization of Vision Transformers under Distribution Shifts

1 code implementation CVPR 2022 Chongzhi Zhang, Mingyuan Zhang, Shanghang Zhang, Daisheng Jin, Qiang Zhou, Zhongang Cai, Haiyu Zhao, Xianglong Liu, Ziwei Liu

By comprehensively investigating these GE-ViTs and comparing with their corresponding CNN models, we observe: 1) For the enhanced model, larger ViTs still benefit more for the OOD generalization.

Out-of-Distribution Generalization Self-Supervised Learning

CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

1 code implementation ICCV 2021 Yijia Weng, He Wang, Qiang Zhou, Yuzhe Qin, Yueqi Duan, Qingnan Fan, Baoquan Chen, Hao Su, Leonidas J. Guibas

For the first time, we propose a unified framework that can handle 9DoF pose tracking for novel rigid object instances as well as per-part pose tracking for articulated objects from known categories.

Pose Tracking

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

1 code implementation CVPR 2021 Qiang Zhou, Chaohui Yu, Zhibin Wang, Qi Qian, Hao Li

To alleviate the confirmation bias problem and improve the quality of pseudo annotations, we further propose a co-rectify scheme based on Instant-Teaching, denoted as Instant-Teaching$^*$.

Ranked #12 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)

Object object-detection +2

Object Detection Made Simpler by Eliminating Heuristic NMS

no code implementations 28 Jan 2021 Qiang Zhou, Chaohui Yu, Chunhua Shen, Zhibin Wang, Hao Li

On the COCO dataset, our simple design achieves superior performance compared to both the FCOS baseline detector with NMS post-processing and the recent end-to-end NMS-free detectors.

Object object-detection +1

Modeling Heterogeneous Relations across Multiple Modes for Potential Crowd Flow Prediction

no code implementations 18 Jan 2021 Qiang Zhou, Jingjing Gu, Xinjiang Lu, Fuzhen Zhuang, Yanchao Zhao, Qiuhong Wang, Xiao Zhang

Intuitively, the potential crowd flow of the new coming site can be implied by exploring the nearby sites.

Coherent optical communications using coherence-cloned Kerr soliton microcombs

no code implementations 1 Jan 2021 Yong Geng, Heng Zhou, Wenwen Cui, Xinjie Han, Qiang Zhang, Boyuan Liu, Guangwei Deng, Qiang Zhou, Kun Qiu

Dissipative Kerr soliton microcomb has been recognized as a promising on-chip multi-wavelength laser source for fiber optical communications, as its comb lines possess frequency and phase stability far beyond independent lasers.

A Tree-structure Convolutional Neural Network for Temporal Features Exaction on Sensor-based Multi-resident Activity Recognition

no code implementations 5 Nov 2020 Jingjing Cao, Fukang Guo, Xin Lai, Qiang Zhou, Jinshan Dai

With the propagation of sensor devices applied in smart home, activity recognition has ignited huge interest and most existing works assume that there is only one habitant.

Activity Recognition Time Series +1

Exploiting Interpretable Patterns for Flow Prediction in Dockless Bike Sharing Systems

1 code implementation 13 Apr 2020 Jingjing Gu, Qiang Zhou, Jingyuan Yang, Yanchi Liu, Fuzhen Zhuang, Yanchao Zhao, Hui Xiong

Unlike the traditional dock-based systems, dockless bike-sharing systems are more convenient for users in terms of flexibility.

Clustering Management

TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting

no code implementations CVPR 2020 Zhuoqian Yang, Wentao Zhu, Wayne Wu, Chen Qian, Qiang Zhou, Bolei Zhou, Chen Change Loy

We present a lightweight video motion retargeting approach TransMoMo that is capable of transferring motion of a person in a source video realistically to another video of a target person.

motion retargeting

Deep Learning-based Detection for COVID-19 from Chest CT using Weak Label

1 code implementation medRxiv 2020 Chuansheng Zheng, Xianbo Deng, Qing Fu, Qiang Zhou, Jiapei Feng, Hui Ma, Wenyu Liu, Xinggang Wang

Our weakly-supervised deep learning model can accurately predict the COVID-19 infectious probability in chest CT volumes without the need for annotating the lesions for training.

COVID-19 Diagnosis Specificity

STN-Homography: estimate homography parameters directly

no code implementations 6 Jun 2019 Qiang Zhou, Xin Li

In this paper, we introduce the STN-Homography model to directly estimate the homography matrix between an image pair.

Homography Estimation

Look at Boundary: A Boundary-Aware Face Alignment Algorithm

2 code implementations CVPR 2018 Wayne Wu, Chen Qian, Shuo Yang, Quan Wang, Yici Cai, Qiang Zhou

By utilising boundary information of 300-W dataset, our method achieves 3.92% mean error with 0.39% failure rate on COFW dataset, and 1.25% mean error on AFLW-Full dataset.

Ranked #4 on Face Alignment on AFLW-19 (using extra training data)

Face Alignment Facial Landmark Detection

Exploiting Spin-Orbit Torque Devices as Reconfigurable Logic for Circuit Obfuscation

no code implementations 8 Feb 2018 Jianlei Yang, Xueyan Wang, Qiang Zhou, Zhaohao Wang, Hai Li, Yiran Chen, Weisheng Zhao

Circuit obfuscation is a frequently used approach to conceal logic functionalities in order to prevent reverse engineering attacks on fabricated chips.

Emerging Technologies Cryptography and Security

Distant Supervision for Entity Linking

no code implementations PACLIC 2015 Miao Fan, Qiang Zhou, Thomas Fang Zheng

In this paper, we propose a new paradigm named distantly supervised entity linking (DSEL), in the sense that the disambiguated entities that belong to a huge knowledge repository (Freebase) are automatically aligned to the corresponding descriptive webpages (Wiki pages).

Descriptive Entity Linking

Probabilistic Belief Embedding for Knowledge Base Completion

no code implementations 10 May 2015 Miao Fan, Qiang Zhou, Andrew Abel, Thomas Fang Zheng, Ralph Grishman

This paper contributes a novel embedding model which measures the probability of each belief $\langle h, r, t, m\rangle$ in a large-scale knowledge repository via simultaneously learning distributed representations for entities ($h$ and $t$), relations ($r$), and the words in relation mentions ($m$).

Knowledge Base Completion Relation

Large Margin Nearest Neighbor Embedding for Knowledge Representation

no code implementations 7 Apr 2015 Miao Fan, Qiang Zhou, Thomas Fang Zheng, Ralph Grishman

The traditional way of storing facts in triplets (head_entity, relation, tail_entity), abbreviated as (h, r, t), makes the knowledge intuitively displayed and easily acquired by mankind, but hardly computed or even reasoned by AI machines.

Link Prediction

Learning Embedding Representations for Knowledge Inference on Imperfect and Incomplete Repositories

no code implementations 27 Mar 2015 Miao Fan, Qiang Zhou, Thomas Fang Zheng

This paper considers the problem of knowledge inference on large-scale imperfect repositories with incomplete coverage by means of embedding entities and relations at the first attempt.

Link Prediction

Errata: Distant Supervision for Relation Extraction with Matrix Completion

no code implementations 17 Nov 2014 Miao Fan, Deli Zhao, Qiang Zhou, Zhiyuan Liu, Thomas Fang Zheng, Edward Y. Chang

The essence of distantly supervised relation extraction is that it is an incomplete multi-label classification problem with sparse and noisy features.

Classification General Classification +4

