Search Results for author: Jizhong Han

Found 38 papers, 19 papers with code

Unveiling Structural Memorization: Structural Membership Inference Attack for Text-to-Image Diffusion Models

no code implementations • 18 Jul 2024 • Qiao Li, Xiaomeng Fu, Xi Wang, Jin Liu, Xingyu Gao, Jiao Dai, Jizhong Han

Therefore, in order to judge whether a specific image is utilized as a member of a model's training set, Membership Inference Attack (MIA) is proposed to serve as a tool for privacy protection.

Inference Attack, Membership Inference Attack, +1

Enhancing Cross-Prompt Transferability in Vision-Language Models through Contextual Injection of Target Tokens

1 code implementation • 19 Jun 2024 • Xikang Yang, Xuehai Tang, Fuqing Zhu, Jizhong Han, Songlin Hu

Vision-language models (VLMs) seamlessly integrate visual and textual data to perform tasks such as image classification, caption generation, and visual question answering.

Caption Generation, Image Classification, +2

Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM

1 code implementation • 9 May 2024 • Xikang Yang, Xuehai Tang, Songlin Hu, Jizhong Han

CoA is a semantic-driven contextual multi-turn attack method that adaptively adjusts the attack policy through contextual feedback and semantic relevance during multi-turn dialogue with a large model, causing the model to produce unreasonable or harmful content.

Explicit Correlation Learning for Generalizable Cross-Modal Deepfake Detection

1 code implementation • 30 Apr 2024 • Cai Yu, Shan Jia, Xiaomeng Fu, Jin Liu, Jiahe Tian, Jiao Dai, Xi Wang, Siwei Lyu, Jizhong Han

With the rising prevalence of deepfakes, there is a growing interest in developing generalizable detection methods for various types of deepfakes.

Audio-Visual Synchronization, DeepFake Detection, +1

Model Will Tell: Training Membership Inference for Diffusion Models

no code implementations • 13 Mar 2024 • Xiaomeng Fu, Xi Wang, Qiao Li, Jin Liu, Jiao Dai, Jizhong Han

In this paper, we explore a novel perspective for the TMI task by leveraging the intrinsic generative priors within the diffusion model.

Binary Classification

Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training

no code implementations • CVPR 2024 • Runze He, Shaofei Huang, Xuecheng Nie, Tianrui Hui, Luoqi Liu, Jiao Dai, Jizhong Han, Guanbin Li, Si Liu

In this paper, we target the adaptive source driven 3D scene editing task by proposing a CustomNeRF model that unifies a text description or a reference image as the editing prompt.

3D scene Editing

Enriching Phrases with Coupled Pixel and Object Contexts for Panoptic Narrative Grounding

no code implementations • 2 Nov 2023 • Tianrui Hui, Zihan Ding, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Jiao Dai, Jizhong Han, Si Liu

Panoptic narrative grounding (PNG) aims to segment things and stuff objects in an image described by noun phrases of a narrative caption.

Decoder, Object

OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions

no code implementations • 28 Sep 2023 • Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han

Other works construct a one-to-one mapping between audio signals and head motion sequences, which introduces ambiguous correspondences into the mapping, since people can move their heads differently when speaking the same content.

Talking Head Generation, Video Generation

Discovering Sounding Objects by Audio Queries for Audio Visual Segmentation

no code implementations • 18 Sep 2023 • Shaofei Huang, Han Li, Yuqing Wang, Hongji Zhu, Jiao Dai, Jizhong Han, Wenge Rong, Si Liu

Explicit object-level semantic correspondence between audio and visual modalities is established by gathering object information from visual features with predefined audio queries.

Object, Semantic correspondence

MFR-Net: Multi-faceted Responsive Listening Head Generation via Denoising Diffusion Model

no code implementations • 31 Aug 2023 • Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han

Responsive listening head generation is an important task that aims to model face-to-face communication scenarios by generating a listener head video given a speaker video and a listener head image.

Denoising, Diversity

FONT: Flow-guided One-shot Talking Head Generation with Natural Head Motions

no code implementations • 31 Mar 2023 • Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han

Specifically, the head pose prediction module is designed to generate head pose sequences from the source face and driving audio.

Diversity, Pose Prediction, +2

OPT: One-shot Pose-Controllable Talking Head Generation

no code implementations • 16 Feb 2023 • Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han

To solve the identity mismatch problem and achieve high-quality free pose control, we present the One-shot Pose-controllable Talking head generation network (OPT).

Disentanglement, Talking Head Generation

Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection

1 code implementation • CVPR 2023 • Shaofei Huang, Zhenwei Shen, Zehao Huang, Zi-han Ding, Jiao Dai, Jizhong Han, Naiyan Wang, Si Liu

An attempt has been made to get rid of the bird's-eye-view (BEV) representation and predict 3D lanes directly from front-view (FV) representations, but it still underperforms BEV-based methods given its lack of a structured representation for 3D lanes.

3D Lane Detection

Bridging Search Region Interaction With Template for RGB-T Tracking

1 code implementation • CVPR 2023 • Tianrui Hui, Zizheng Xun, Fengguang Peng, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Jiao Dai, Jizhong Han, Si Liu

To alleviate these limitations, we propose a novel Template-Bridged Search region Interaction (TBSI) module which exploits templates as the medium to bridge the cross-modal interaction between RGB and TIR search regions by gathering and distributing target-relevant object and environment contexts.

RGB-T Tracking, Template Matching

RaP: Redundancy-aware Video-language Pre-training for Text-Video Retrieval

1 code implementation • 13 Oct 2022 • Xing Wu, Chaochen Gao, Zijia Lin, Zhongyuan Wang, Jizhong Han, Songlin Hu

Sparse sampling is also likely to miss important frames corresponding to some text portions, resulting in textual redundancy.

Contrastive Learning, Retrieval, +1

InfoCSE: Information-aggregated Contrastive Learning of Sentence Embeddings

2 code implementations • 8 Oct 2022 • Xing Wu, Chaochen Gao, Zijia Lin, Jizhong Han, Zhongyuan Wang, Songlin Hu

Contrastive learning has been extensively studied in sentence embedding learning, which assumes that the embeddings of different views of the same sentence are closer.

Contrastive Learning, Language Modelling, +5

ESimCSE: Enhanced Sample Building Method for Contrastive Learning of Unsupervised Sentence Embedding

2 code implementations • COLING 2022 • Xing Wu, Chaochen Gao, Liangjun Zang, Jizhong Han, Zhongyuan Wang, Songlin Hu

Unsup-SimCSE takes dropout as a minimal data augmentation method, and passes the same input sentence to a pre-trained Transformer encoder (with dropout turned on) twice to obtain the two corresponding embeddings to build a positive pair.

Contrastive Learning, Data Augmentation, +5
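
The dropout-based positive pair described in the ESimCSE snippet above can be illustrated with a toy sketch. This is not the authors' code: the "encoder" here is a hypothetical stand-in (dropout on the input followed by one random linear layer) rather than a pre-trained Transformer, and every name in it (`dropout`, `encode`, `positive_pair`) is an assumption made for illustration. It only shows the core idea that passing the same input twice with dropout active gives two different embeddings.

```python
import math
import random


def dropout(vec, p, rng):
    """Inverted dropout: zero each unit with probability p, rescale the rest."""
    keep = 1.0 - p
    return [x / keep if rng.random() >= p else 0.0 for x in vec]


def encode(features, weights, p, rng):
    """Toy encoder (an assumption): dropout on the input, then one linear layer."""
    dropped = dropout(features, p, rng)
    return [sum(w * x for w, x in zip(row, dropped)) for row in weights]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def positive_pair(features, weights, p=0.1, seed=0):
    """Encode the same input twice; each pass draws a fresh dropout mask."""
    rng = random.Random(seed)
    first = encode(features, weights, p, rng)
    second = encode(features, weights, p, rng)
    return first, second


# Hypothetical input features and encoder weights.
data_rng = random.Random(42)
features = [data_rng.uniform(-1, 1) for _ in range(64)]
weights = [[data_rng.uniform(-1, 1) for _ in range(64)] for _ in range(8)]

view_a, view_b = positive_pair(features, weights)
print("similarity of the two views:", cosine(view_a, view_b))
```

With dropout disabled (`p=0.0`) the two passes return identical embeddings, which is why the method depends on dropout being turned on during encoding.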

Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation

no code implementations • CVPR 2021 • Tianrui Hui, Shaofei Huang, Si Liu, Zihan Ding, Guanbin Li, Wenguan Wang, Jizhong Han, Fei Wang

Though 3D convolutions are amenable to recognizing which actor is performing the queried actions, they also inevitably introduce misaligned spatial information from adjacent frames, which confuses the features of the target frame and yields inaccurate segmentation.

Decoder, feature selection, +1

LI-Net: Large-Pose Identity-Preserving Face Reenactment Network

no code implementations • 7 Apr 2021 • Jin Liu, Peng Chen, Tao Liang, Zhaoxing Li, Cai Yu, Shuqiao Zou, Jiao Dai, Jizhong Han

Face reenactment is a challenging task, as it is difficult to maintain accurate expression, pose and identity simultaneously.

Face Reenactment

ORDNet: Capturing Omni-Range Dependencies for Scene Parsing

no code implementations • 11 Jan 2021 • Shaofei Huang, Si Liu, Tianrui Hui, Jizhong Han, Bo Li, Jiashi Feng, Shuicheng Yan

Our ORDNet is able to extract more comprehensive context information and adapt well to complex spatial variance in scene images.

Scene Parsing

Referring Image Segmentation via Cross-Modal Progressive Comprehension

1 code implementation • CVPR 2020 • Shaofei Huang, Tianrui Hui, Si Liu, Guanbin Li, Yunchao Wei, Jizhong Han, Luoqi Liu, Bo Li

In addition to the CMPC module, we further leverage a simple yet effective TGFE module to integrate the reasoned multimodal features from different levels with the guidance of textual information.

Attribute, Image Segmentation, +2

Hierarchical Interaction Networks with Rethinking Mechanism for Document-level Sentiment Analysis

1 code implementation • 16 Jul 2020 • Lingwei Wei, Dou Hu, Wei Zhou, Xuehai Tang, Xiaodan Zhang, Xin Wang, Jizhong Han, Songlin Hu

Furthermore, we design a Sentiment-based Rethinking mechanism (SR) by refining the HIN with sentiment label information to learn a more sentiment-aware document representation.

Sentiment Analysis, Sentiment Classification, +1

Data Augmentation for Copy-Mechanism in Dialogue State Tracking

no code implementations • 22 Feb 2020 • Xiaohui Song, Liangjun Zang, Yipeng Su, Xing Wu, Jizhong Han, Songlin Hu

While several state-of-the-art approaches to dialogue state tracking (DST) have shown promising performance on several benchmarks, there is still a significant performance gap between seen slot values (i.e., values that occur in both the training set and the test set) and unseen ones (values that occur in the test set but not in the training set).

Data Augmentation, Dialogue State Tracking
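
The seen/unseen distinction in the dialogue state tracking snippet above comes down to simple set arithmetic. The sketch below is only an illustration with made-up slot values; the helper names (`split_slot_values`, `accuracy_by_group`) are hypothetical and not taken from the paper. It assumes the usual reading in which a value counts as unseen when it appears in the test set but not in the training set.

```python
def split_slot_values(train_values, test_values):
    """Partition test-set slot values into seen and unseen ones."""
    train, test = set(train_values), set(test_values)
    seen = test & train      # occur in both the training and the test set
    unseen = test - train    # occur in the test set only
    return seen, unseen


def accuracy_by_group(predictions, train_values):
    """Accuracy per group; predictions is a list of (gold, predicted) pairs."""
    train = set(train_values)
    totals = {"seen": [0, 0], "unseen": [0, 0]}  # [correct, total]
    for gold, pred in predictions:
        group = "seen" if gold in train else "unseen"
        totals[group][0] += int(gold == pred)
        totals[group][1] += 1
    return {g: (c / n if n else None) for g, (c, n) in totals.items()}


# Made-up slot values, for illustration only.
train_values = ["cheap", "moderate", "north"]
predictions = [
    ("cheap", "cheap"),
    ("north", "north"),
    ("expensive", "cheap"),   # unseen value, predicted wrongly
    ("south", "south"),       # unseen value, predicted correctly
]

seen, unseen = split_slot_values(train_values, [gold for gold, _ in predictions])
scores = accuracy_by_group(predictions, train_values)
print(sorted(seen), sorted(unseen), scores)
```

Reporting accuracy separately for the two groups is one way to make the performance gap mentioned in the snippet visible.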

Beyond Statistical Relations: Integrating Knowledge Relations into Style Correlations for Multi-Label Music Style Classification

1 code implementation • 9 Nov 2019 • Qianwen Ma, Chunyuan Yuan, Wei Zhou, Jizhong Han, Songlin Hu

Based on the two types of relations, we use a graph convolutional network to learn the deep correlations between styles automatically.

General Classification

Multi-hop Selector Network for Multi-turn Response Selection in Retrieval-based Chatbots

1 code implementation • IJCNLP 2019 • Chunyuan Yuan, Wei Zhou, Mingming Li, Shangwen Lv, Fuqing Zhu, Jizhong Han, Songlin Hu

Existing works mainly focus on matching candidate responses with every context utterance at multiple levels of granularity, ignoring the side effects of using excessive context information.

Conversational Response Selection, Retrieval

Learning review representations from user and product level information for spam detection

no code implementations • 10 Sep 2019 • Chunyuan Yuan, Wei Zhou, Qianwen Ma, Shangwen Lv, Jizhong Han, Songlin Hu

Then, we use orthogonal decomposition and fusion attention to learn a user, review, and product representation from the review information.

Spam detection

Jointly embedding the local and global relations of heterogeneous graph for rumor detection

1 code implementation • 10 Sep 2019 • Chunyuan Yuan, Qianwen Ma, Wei Zhou, Jizhong Han, Songlin Hu

The development of social media has revolutionized the way people communicate, share information and make decisions, but it also provides an ideal platform for publishing and spreading rumors.

TransSent: Towards Generation of Structured Sentences with Discourse Marker

no code implementations • 5 Sep 2019 • Xing Wu, Dongjun Wei, Liangjun Zang, Jizhong Han, Songlin Hu

Automatic and human evaluation results show that TransSent can generate high-quality structured sentences and shows some scalability across different tasks.

Dialogue Generation, Sentence

ESA: Entity Summarization with Attention

2 code implementations • 25 May 2019 • Dongjun Wei, Yaxin Liu, Fuqing Zhu, Liangjun Zang, Wei Zhou, Jizhong Han, Songlin Hu

Entity summarization aims at creating brief but informative descriptions of entities from knowledge graphs.

Clustering, Knowledge Graphs

Imbalanced Sentiment Classification Enhanced with Discourse Marker

no code implementations • 28 Mar 2019 • Tao Zhang, Xing Wu, Meng Lin, Jizhong Han, Songlin Hu

Imbalanced data is common in the real world, especially in sentiment-related corpora, making it difficult to train a classifier to distinguish latent sentiment in text data.

Classification, Data Augmentation, +4

AccUDNN: A GPU Memory Efficient Accelerator for Training Ultra-deep Neural Networks

no code implementations • 21 Jan 2019 • Jinrong Guo, Wantao Liu, Wang Wang, Qu Lu, Songlin Hu, Jizhong Han, Ruixuan Li

Typically, an ultra-deep neural network (UDNN) tends to yield a high-quality model, but its training process is usually resource-intensive and time-consuming.

Conditional BERT Contextual Augmentation

5 code implementations • 17 Dec 2018 • Xing Wu, Shangwen Lv, Liangjun Zang, Jizhong Han, Songlin Hu

BERT demonstrates that a deep bidirectional language model is more powerful than either a unidirectional language model or the shallow concatenation of a forward and a backward model.

Data Augmentation, Language Modelling, +1

Cross-domain Human Parsing via Adversarial Feature and Label Adaptation

no code implementations • 4 Jan 2018 • Si Liu, Yao Sun, Defa Zhu, Guanghui Ren, Yu Chen, Jiashi Feng, Jizhong Han

Our proposed model explicitly learns a feature compensation network, which is specialized for mitigating the cross-domain differences.

Human Parsing
