Search Results for author: Zhidong Deng

Found 17 papers, 6 papers with code

Improving Detection in Aerial Images by Capturing Inter-Object Relationships

no code implementations • 5 Apr 2024 • Botao Ren, Botian Xu, Yifan Pu, Jingyi Wang, Zhidong Deng

In many image domains, the spatial distribution of objects in a scene exhibits meaningful patterns governed by their semantic relationships.

DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models

no code implementations • 15 Dec 2023 • Yifeng Ma, Shiwei Zhang, Jiayu Wang, Xiang Wang, Yingya Zhang, Zhidong Deng

In this work, we propose the DreamTalk framework to fill this gap, which employs meticulous design to unlock the potential of diffusion models in generating expressive talking heads.

Denoising Talking Head Generation

Feedback RoI Features Improve Aerial Object Detection

no code implementations • 28 Nov 2023 • Botao Ren, Botian Xu, Tengyu Liu, Jingyi Wang, Zhidong Deng

Neuroscience studies have shown that the human visual system utilizes high-level feedback information to guide lower-level perception, enabling adaptation to signals of different characteristics.

feature selection Object +2

3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment

1 code implementation • ICCV 2023 • Ziyu Zhu, Xiaojian Ma, Yixin Chen, Zhidong Deng, Siyuan Huang, Qing Li

3D vision-language grounding (3D-VL) is an emerging field that aims to connect the 3D physical world with natural language, which is crucial for achieving embodied intelligence.

Dense Captioning Question Answering +3

Improving Scene Graph Generation with Superpixel-Based Interaction Learning

no code implementations • 4 Aug 2023 • Jingyi Wang, Can Zhang, Jinfa Huang, Botao Ren, Zhidong Deng

(ii) We explore intra-entity and cross-entity interactions among the superpixels to enrich fine-grained interactions between entities at an earlier stage.

Graph Generation Scene Graph Generation +1

Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs

no code implementations • 19 May 2023 • IokTong Lei, Zhidong Deng

As a way of communicating with users and any LLMs like GPT or PaLM2, prompting becomes an increasingly important research topic for better utilization of LLMs.

Arithmetic Reasoning GSM8K +4

Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs

1 code implementation • 15 May 2023 • Jingyi Wang, Jinfa Huang, Can Zhang, Zhidong Deng

In this paper, we propose a Time-variant Relation-aware TRansformer (TR$^2$), which aims to model the temporal change of relations in dynamic scene graphs.

Relation Scene Graph Generation +1

TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles

no code implementations • 1 Apr 2023 • Yifeng Ma, Suzhen Wang, Yu Ding, Bowen Ma, Tangjie Lv, Changjie Fan, Zhipeng Hu, Zhidong Deng, Xin Yu

In this work, we propose an expression-controllable one-shot talking head method, dubbed TalkCLIP, where the expression in a speech is specified in natural language.

2D Semantic Segmentation task 3 (25 classes) Talking Head Generation

StyleTalk: One-shot Talking Head Generation with Controllable Speaking Styles

1 code implementation • 3 Jan 2023 • Yifeng Ma, Suzhen Wang, Zhipeng Hu, Changjie Fan, Tangjie Lv, Yu Ding, Zhidong Deng, Xin Yu

In a nutshell, we aim to attain a speaking style from an arbitrary reference speaking video and then drive the one-shot portrait to speak with the reference speaking style and another piece of audio.

Talking Face Generation Talking Head Generation

DuMLP-Pin: A Dual-MLP-dot-product Permutation-invariant Network for Set Feature Extraction

1 code implementation • 8 Mar 2022 • Jiajun Fei, Ziyu Zhu, Wenlei Liu, Zhidong Deng, Mingyang Li, Huanjun Deng, Shuo Zhang

We strictly prove that any permutation-invariant function implemented by DuMLP-Pin can be decomposed into two or more permutation-equivariant ones in a dot-product way when the cardinality of the given input set is greater than a threshold.

Attribute Point Cloud Classification

Phase Space Reconstruction Network for Lane Intrusion Action Recognition

no code implementations • 22 Feb 2021 • Ruiwen Zhang, Zhidong Deng, Hongsen Lin, Hongchao Lu

In a complex road traffic scene, illegal lane intrusion by pedestrians or cyclists constitutes one of the main safety challenges in autonomous driving applications.

Action Recognition Autonomous Driving +5

A Deep Graph Wavelet Convolutional Neural Network for Semi-supervised Node Classification

1 code implementation • 19 Feb 2021 • Jingyi Wang, Zhidong Deng

Graph convolutional neural networks provide good solutions for node classification and other tasks with non-Euclidean data.

General Classification Node Classification

DETR for Crowd Pedestrian Detection

1 code implementation • 12 Dec 2020 • Matthieu Lin, Chuming Li, Xingyuan Bu, Ming Sun, Chen Lin, Junjie Yan, Wanli Ouyang, Zhidong Deng

Furthermore, the bipartite match of ED harms the training efficiency due to the large ground truth number in crowd scenes.

Pedestrian Detection

Fast Object Detection in Compressed Video

no code implementations • ICCV 2019 • Shiyao Wang, Hongchao Lu, Zhidong Deng

To the best of our knowledge, the MMNet is the first work that investigates a deep convolutional detector on compressed videos.

Object object-detection +2

Recent progress in semantic image segmentation

no code implementations • 20 Sep 2018 • Xiaolong Liu, Zhidong Deng, Yuhan Yang

In this paper, we divide semantic image segmentation methods into two categories: traditional methods and recent DNN-based methods.

Image Segmentation Segmentation +1

Fully Motion-Aware Network for Video Object Detection

no code implementations • ECCV 2018 • Shiyao Wang, Yucong Zhou, Junjie Yan, Zhidong Deng

Video object detection is challenging in the presence of appearance deterioration in certain video frames.

Object object-detection +1
