Search Results for author: Jian Guan

Found 51 papers, 32 papers with code

Persona-Guided Planning for Controlling the Protagonist’s Persona in Story Generation

1 code implementation NAACL 2022 Zhexin Zhang, Jiaxin Wen, Jian Guan, Minlie Huang

In this paper, we aim to control the protagonist’s persona in story generation, i.e., generating a story from a leading context and a persona description, where the protagonist should exhibit the specified personality through a coherent event sequence.

Sentence Story Generation

PromptCoT: Synthesizing Olympiad-level Problems for Mathematical Reasoning in Large Language Models

1 code implementation 4 Mar 2025 Xueliang Zhao, Wei Wu, Jian Guan, Lingpeng Kong

The ability of large language models to solve complex mathematical problems has progressed significantly, particularly for tasks requiring advanced reasoning.

GSM8K Math +1

Theoretical Benefit and Limitation of Diffusion Language Model

no code implementations 13 Feb 2025 Guhao Feng, Yihan Geng, Jian Guan, Wei Wu, LiWei Wang, Di He

In this paper, we present a rigorous theoretical analysis of a widely used type of diffusion language model, the Masked Diffusion Model (MDM), and find that its effectiveness heavily depends on the target evaluation metric.

Language Modeling Language Modelling +2

Human Decision-making is Susceptible to AI-driven Manipulation

no code implementations 11 Feb 2025 Sahand Sabour, June M. Liu, Siyang Liu, Chris Z. Yao, Shiyao Cui, Xuanming Zhang, Wen Zhang, Yaru Cao, Advait Bhat, Jian Guan, Wei Wu, Rada Mihalcea, Hongning Wang, Tim Althoff, Tatia M. C. Lee, Minlie Huang

Through a randomized controlled trial with 233 participants, we examined human susceptibility to such manipulation in financial (e.g., purchases) and emotional (e.g., conflict resolution) decision-making contexts.

Decision Making

Fusion of Millimeter-wave Radar and Pulse Oximeter Data for Low-burden Diagnosis of Obstructive Sleep Apnea-Hypopnea Syndrome

no code implementations 25 Jan 2025 Wei Wang, Zhaoxi Chen, Wenyu Zhang, Zetao Wang, Xiang Zhao, Chenyang Li, Jian Guan, Shankai Yin, Gang Li

Objective: The aim of the study is to develop a novel method for improved diagnosis of obstructive sleep apnea-hypopnea syndrome (OSAHS) in clinical or home settings, with the focus on achieving diagnostic performance comparable to the gold-standard polysomnography (PSG) with significantly reduced monitoring burden.

Diagnostic Sleep Staging +1

Spectral-Temporal Fusion Representation for Person-in-Bed Detection

no code implementations 27 Dec 2024 Xuefeng Yang, Shiheng Zhang, Jian Guan, Feiyang Xiao, Wei Lu, Qiaoxi Zhu

This study is based on the ICASSP 2025 Signal Processing Grand Challenge's Accelerometer-Based Person-in-Bed Detection Challenge, which aims to determine bed occupancy using accelerometer signals.

Data Augmentation

Attacking Voice Anonymization Systems with Augmented Feature and Speaker Identity Difference

no code implementations 26 Dec 2024 Yanzhe Zhang, Zhonghao Bi, Feiyang Xiao, Xuefeng Yang, Qiaoxi Zhu, Jian Guan

This study focuses on the First VoicePrivacy Attacker Challenge within the ICASSP 2025 Signal Processing Grand Challenge, which aims to develop speaker verification systems capable of determining whether two anonymized speech signals are from the same speaker.

Data Augmentation Speaker Verification

Graph-Enhanced Dual-Stream Feature Fusion with Pre-Trained Model for Acoustic Traffic Monitoring

no code implementations 26 Dec 2024 Shitong Fan, Feiyang Xiao, Wenbo Wang, Shuhan Qi, Qiaoxi Zhu, Wenwu Wang, Jian Guan

We propose a graph-enhanced dual-stream feature fusion strategy which consists of a vehicle type feature extraction (VTFE) branch, a vehicle direction feature extraction (VDFE) branch, and a frame-level feature fusion module to combine the type and direction feature for enhanced performance.

Graph Attention Sound Source Localization

Band Prompting Aided SAR and Multi-Spectral Data Fusion Framework for Local Climate Zone Classification

no code implementations 24 Dec 2024 Haiyan Lan, Shujun Li, Mingjie Xie, Xuanjia Zhao, Hongning Liu, Pengming Feng, Dongli Xu, Guangjun He, Jian Guan

In this paper, a novel band prompting aided data fusion framework is proposed for LCZ classification, namely BP-LCZ, which utilizes textual prompts associated with band groups to guide the model in learning the physical attributes of different bands and semantics of various categories inherent in SAR and multi-spectral data to augment the fused feature, thus enhancing LCZ classification performance.

Classification

Towards a Comprehensive Benchmark for Pathological Lymph Node Metastasis in Breast Cancer Sections

1 code implementation 16 Nov 2024 Xitong Ling, Yuanyuan Lei, Jiawen Li, Junru Cheng, Wenting Huang, Tian Guan, Jian Guan, Yonghong He

Advances in optical microscopy scanning have significantly contributed to computational pathology (CPath) by converting traditional histopathological slides into whole slide images (WSIs).

Benchmarking Diagnostic +2

Independent Feature Enhanced Crossmodal Fusion for Match-Mismatch Classification of Speech Stimulus and EEG Response

no code implementations 19 Oct 2024 Shitong Fan, Wenbo Wang, Feiyang Xiao, Shiheng Zhang, Qiaoxi Zhu, Jian Guan

Specifically, our IFE-CF contains a crossmodal encoder to encode the speech stimulus and the EEG response with a two-branch structure connected via crossmodal attention mechanism in the encoding process, a multi-channel fusion module to fuse features of two modalities by aggregating the interaction feature obtained from the crossmodal encoder and the independent feature obtained from the speech stimulus and EEG response, and a predictor to give the matching result.

EEG Eeg Decoding

Detection of Sleep Apnea-Hypopnea Events Using Millimeter-wave Radar and Pulse Oximeter

no code implementations 28 Sep 2024 Wei Wang, Chenyang Li, Zhaoxi Chen, Wenyu Zhang, Zetao Wang, Xi Guo, Jian Guan, Gang Li

Obstructive Sleep Apnea-Hypopnea Syndrome (OSAHS) is a sleep-related breathing disorder associated with significant morbidity and mortality worldwide.

Temporal Localization

Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning

2 code implementations 19 Sep 2024 Jiaxin Wen, Jian Guan, Hongning Wang, Wei Wu, Minlie Huang

To train CodePlan, we construct a large-scale dataset of 2M examples that integrate code-form plans with standard prompt-response pairs from existing corpora.

Form Instruction Following +1

Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules

no code implementations 9 Jul 2024 Zhuocheng Gong, Ang Lv, Jian Guan, Junxi Yan, Wei Wu, Huishuai Zhang, Minlie Huang, Dongyan Zhao, Rui Yan

More interestingly, with a fixed parameter budget, MoM-large enables an over 38% increase in depth for computation graphs compared to GPT-2-large, resulting in absolute gains of 1.4 on GLUE and 1 on XSUM.

From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis

2 code implementations 28 Jun 2024 Chuanqi Cheng, Jian Guan, Wei Wu, Rui Yan

To overcome the challenge, we first introduce a least-to-most visual reasoning paradigm, which interleaves steps of decomposing a question into sub-questions and invoking external tools for resolving sub-questions.

Visual Question Answering (VQA) Visual Reasoning

FastDrag: Manipulate Anything in One Step

1 code implementation 24 May 2024 Xuanjia Zhao, Jian Guan, Congyi Fan, Dongli Xu, Youtian Lin, Haiwei Pan, Pengming Feng

Drag-based image editing using generative models provides precise control over image contents, enabling users to manipulate anything in an image with a few clicks.

SISP: A Benchmark Dataset for Fine-grained Ship Instance Segmentation in Panchromatic Satellite Images

1 code implementation 6 Feb 2024 Pengming Feng, Mingjie Xie, Hongning Liu, Xuanjia Zhao, Guangjun He, Xueliang Zhang, Jian Guan

To this end, we propose a benchmark dataset for fine-grained Ship Instance Segmentation in Panchromatic satellite images, namely SISP, which contains 56,693 well-annotated ship instances with four fine-grained categories across 10,000 sliced images, and all the images are collected from SuperView-1 satellite with the resolution of 0.5m.

Diversity Instance Segmentation +2

AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback

1 code implementation 2 Feb 2024 Jian Guan, Wei Wu, Zujie Wen, Peng Xu, Hongning Wang, Minlie Huang

We present AMOR, an agent framework based on open-source LLMs, which reasons with external knowledge bases and adapts to specific domains through human supervision to the reasoning process.

Language Models Hallucinate, but May Excel at Fact Verification

1 code implementation 23 Oct 2023 Jian Guan, Jesse Dodge, David Wadden, Minlie Huang, Hao Peng

Recent progress in natural language processing (NLP) owes much to remarkable advances in large language models (LLMs).

Fact Verification Hallucination

Hierarchical Metadata Information Constrained Self-Supervised Learning for Anomalous Sound Detection Under Domain Shift

no code implementations 14 Sep 2023 Haiyan Lan, Qiaoxi Zhu, Jian Guan, Yuming Wei, Wenwu Wang

Self-supervised learning methods have achieved promising performance for anomalous sound detection (ASD) under domain shift, where the type of domain shift is considered in feature learning by incorporating section IDs.

Attribute Self-Supervised Learning +1

Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation

1 code implementation 4 Jul 2023 Jian Guan, Minlie Huang

Despite the huge progress in myriad generation tasks, pretrained language models (LMs) such as GPT2 still tend to generate repetitive texts with maximization-based decoding algorithms for open-ended generation.

Attribute Sentence

Time-weighted Frequency Domain Audio Representation with GMM Estimator for Anomalous Sound Detection

1 code implementation 5 May 2023 Jian Guan, Youde Liu, Qiaoxi Zhu, Tieran Zheng, Jiqing Han, Wenwu Wang

This paper presents Time-Weighted Frequency Domain Representation (TWFR) with the GMM method (TWFR-GMM) for anomalous sound detection.

Re$^3$Dial: Retrieve, Reorganize and Rescale Dialogue Corpus for Long-Turn Open-Domain Dialogue Pre-training

1 code implementation 4 May 2023 Jiaxin Wen, Hao Zhou, Jian Guan, Minlie Huang

However, the pre-trained dialogue model's ability to utilize long-range context is limited due to the scarcity of long-turn dialogue sessions.

EARL: An Elliptical Distribution aided Adaptive Rotation Label Assignment for Oriented Object Detection in Remote Sensing Images

1 code implementation 14 Jan 2023 Jian Guan, Mingjie Xie, Youtian Lin, Guangjun He, Pengming Feng

In addition, a dynamic elliptical distribution aided sampling (DED) strategy is proposed to make the sample distribution more flexible to fit the shapes and orientations of targets, and filter out low-quality samples.

object-detection Object Detection +1

A Benchmark for Understanding and Generating Dialogue between Characters in Stories

no code implementations 18 Sep 2022 Jianzhu Yao, Ziqi Liu, Jian Guan, Minlie Huang

We build a new dataset DialStory, which consists of 105k Chinese stories with a large amount of dialogue weaved into the plots to support the evaluation.

Dialogue Generation Speaker Recognition

StoryTrans: Non-Parallel Story Author-Style Transfer with Discourse Representations and Content Enhancing

1 code implementation 29 Aug 2022 Xuekai Zhu, Jian Guan, Minlie Huang, Juan Liu

Moreover, to enhance content preservation, we design a mask-and-fill framework to explicitly fuse style-specific keywords of source texts into generation.

Sentence Style Transfer +1

Generating Coherent Narratives by Learning Dynamic and Discrete Entity States with a Contrastive Framework

1 code implementation 8 Aug 2022 Jian Guan, Zhenyu Yang, Rongsheng Zhang, Zhipeng Hu, Minlie Huang

Despite advances in generating fluent texts, existing pretraining models tend to attach incoherent event sequences to involved entities when generating narratives such as stories and news.

Decoder Sentence

Local Information Assisted Attention-free Decoder for Audio Captioning

1 code implementation 10 Jan 2022 Feiyang Xiao, Jian Guan, Haiyan Lan, Qiaoxi Zhu, Wenwu Wang

Although this method effectively captures global information within audio data via the self-attention mechanism, it may ignore the event with short time duration, due to its limitation in capturing local information in an audio signal, leading to inaccurate prediction of captions.

Audio captioning Caption Generation +1

LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation

2 code implementations 30 Aug 2021 Jian Guan, Zhuoer Feng, Yamei Chen, Ruilin He, Xiaoxi Mao, Changjie Fan, Minlie Huang

Therefore, we propose a story-centric benchmark named LOT for evaluating Chinese long text modeling, which aggregates two understanding tasks and two generation tasks.

Decoder Text Infilling

CPM-2: Large-scale Cost-effective Pre-trained Language Models

2 code implementations 20 Jun 2021 Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan YAO, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun

We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference.

Decoder

OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics

1 code implementation ACL 2021 Jian Guan, Zhexin Zhang, Zhuoer Feng, Zitao Liu, Wenbiao Ding, Xiaoxi Mao, Changjie Fan, Minlie Huang

Automatic metrics are essential for developing natural language generation (NLG) models, particularly for open-ended language generation tasks such as story generation.

Story Generation

Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence

1 code implementation ACL 2021 Jian Guan, Xiaoxi Mao, Changjie Fan, Zitao Liu, Wenbiao Ding, Minlie Huang

Generating long and coherent text is an important but challenging task, particularly for open-ended language generation tasks such as story generation.

Decoder Semantic Similarity +3

Stylized Story Generation with Style-Guided Planning

no code implementations Findings (ACL) 2021 Xiangzhe Kong, Jialiang Huang, Ziquan Tung, Jian Guan, Minlie Huang

Current storytelling systems focus more on generating stories with coherent plots regardless of the narration style, which is important for controllable text generation.

Story Generation

Low-dimensional Denoising Embedding Transformer for ECG Classification

no code implementations 31 Mar 2021 Jian Guan, Wenbo Wang, Pengming Feng, Xinxin Wang, Wenwu Wang

However, the high-dimensional embedding obtained via 1-D convolution and positional encoding can lead to the loss of the signal's own temporal information and a large amount of training parameters.

Classification Denoising +2

Time-domain Speech Enhancement with Generative Adversarial Learning

1 code implementation 30 Mar 2021 Feiyang Xiao, Jian Guan, Qiuqiang Kong, Wenwu Wang

Speech enhancement aims to obtain speech signals with high intelligibility and quality from noisy speech.

Generative Adversarial Network Speech Enhancement

A Text GAN for Language Generation with Non-Autoregressive Generator

no code implementations 1 Jan 2021 Fei Huang, Jian Guan, Pei Ke, Qihan Guo, Xiaoyan Zhu, Minlie Huang

Despite the great success of Generative Adversarial Networks (GANs) in generating high-quality images, GANs for text generation still face two major challenges: first, most text GANs are unstable in training mainly due to ineffective optimization of the generator, and they heavily rely on maximum likelihood pretraining; second, most text GANs adopt autoregressive generators without latent variables, which largely limits the ability to learn latent representations for natural language text.

Decipherment Representation Learning +2

UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation

1 code implementation EMNLP 2020 Jian Guan, Minlie Huang

Experiments on two story datasets demonstrate that UNION is a reliable measure for evaluating the quality of generated stories, which correlates better with human judgments and is more generalizable than existing state-of-the-art metrics.

Story Generation

CoTK: An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation

1 code implementation 3 Feb 2020 Fei Huang, Dazhen Wan, Zhihong Shao, Pei Ke, Jian Guan, Yilin Niu, Xiaoyan Zhu, Minlie Huang

In text generation evaluation, many practical issues, such as inconsistent experimental settings and metric implementations, are often ignored but lead to unfair evaluation and untenable conclusions.

Text Generation

A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation

1 code implementation TACL 2020 Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang

To further capture the causal and temporal dependencies between the sentences in a reasonable story, we employ multi-task learning which combines a discriminative objective to distinguish true and fake stories during fine-tuning.

Multi-Task Learning Story Generation

A Practical Solution for SAR Despeckling With Adversarial Learning Generated Speckled-to-Speckled Images

no code implementations 13 Dec 2019 Ye Yuan, Jian Guan, Pengming Feng, Yanxia Wu

In this letter, we aim to address a synthetic aperture radar (SAR) despeckling problem with the necessity of neither clean (speckle-free) SAR images nor independent speckled image pairs from the same scene, and a practical solution for SAR despeckling (PSD) is proposed.

IENet: Interacting Embranchment One Stage Anchor Free Detector for Orientation Aerial Object Detection

no code implementations 2 Dec 2019 Youtian Lin, Pengming Feng, Jian Guan, Wenwu Wang, Jonathon Chambers

First, a novel geometric transformation is employed to better represent the oriented object in angle prediction, then a branch interactive module with a self-attention mechanism is developed to fuse features from classification and box regression branches.

Object object-detection +4

Story Ending Generation with Incremental Encoding and Commonsense Knowledge

1 code implementation 30 Aug 2018 Jian Guan, Yansen Wang, Minlie Huang

This task requires not only understanding the context clues which play an important role in planning the plot but also handling implicit knowledge to make a reasonable, coherent story.

Image-guided Story Ending Generation

Generating Informative Responses with Controlled Sentence Function

1 code implementation ACL 2018 Pei Ke, Jian Guan, Minlie Huang, Xiaoyan Zhu

Experiments show that our model outperforms state-of-the-art baselines, and it has the ability to generate responses with both controlled sentence function and informative content.

Position Sentence +2
