Search Results for author: Yan Song

Found 99 papers, 52 papers with code

A Hybrid Approach to Automatic Corpus Generation for Chinese Spelling Check

1 code implementation EMNLP 2018 Dingmin Wang, Yan Song, Jing Li, Jialong Han, Haisong Zhang

Chinese spelling check (CSC) is a challenging yet meaningful task, which not only serves as a preprocessing step in many natural language processing (NLP) applications, but also facilitates reading and understanding of running texts in people's daily lives.

Language Modelling Optical Character Recognition (OCR) +1

Generating Radiology Reports via Memory-driven Transformer

2 code implementations EMNLP 2020 Zhihong Chen, Yan Song, Tsung-Hui Chang, Xiang Wan

Particularly, this is the first work reporting the generation results on MIMIC-CXR to the best of our knowledge.

Text Generation

Cross-lingual Knowledge Graph Alignment via Graph Matching Neural Network

1 code implementation ACL 2019 Kun Xu, Li-Wei Wang, Mo Yu, Yansong Feng, Yan Song, Zhiguo Wang, Dong Yu

Previous cross-lingual knowledge graph (KG) alignment studies rely on entity embeddings derived only from monolingual KG structural information, which may fail at matching entities that have different facts in two KGs.

Entity Embeddings Graph Attention +1

An Empirical Study on Google Research Football Multi-agent Scenarios

1 code implementation 16 May 2023 Yan Song, He Jiang, Zheng Tian, Haifeng Zhang, Yingping Zhang, Jiangcheng Zhu, Zonghong Dai, Weinan Zhang, Jun Wang

Few multi-agent reinforcement learning (MARL) studies on Google Research Football (GRF) focus on the 11v11 multi-agent full-game scenario and, to the best of our knowledge, no open benchmark on this scenario has been released to the public.

Benchmarking Multi-agent Reinforcement Learning +1

Learning Semantic Relationship Among Instances for Image-Text Matching

1 code implementation CVPR 2023 Zheren Fu, Zhendong Mao, Yan Song, Yongdong Zhang

Image-text matching, a bridge connecting image and language, is an important task, which generally learns a holistic cross-modal embedding to achieve a high-quality semantic alignment between the two modalities.

Image Retrieval Image-text matching +8

Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-way Attentions of Auto-analyzed Knowledge

1 code implementation ACL 2020 Yuanhe Tian, Yan Song, Xiang Ao, Fei Xia, Xiaojun Quan, Tong Zhang, Yonggang Wang

Chinese word segmentation (CWS) and part-of-speech (POS) tagging are important fundamental tasks for Chinese language processing, where joint learning of them is an effective one-step solution for both tasks.

Chinese Word Segmentation Part-Of-Speech Tagging +2

Cross-modal Memory Networks for Radiology Report Generation

1 code implementation ACL 2021 Zhihong Chen, Yaling Shen, Yan Song, Xiang Wan

Medical imaging plays a significant role in clinical practice of medical diagnosis, where the text reports of the images are essential in understanding them and facilitating later treatments.

Medical Diagnosis Text Generation

Dependency-driven Relation Extraction with Attentive Graph Convolutional Networks

1 code implementation ACL 2021 Yuanhe Tian, Guimin Chen, Yan Song, Xiang Wan

Syntactic information, especially dependency trees, has been widely used by existing studies to improve relation extraction with better semantic guidance for analyzing the context information associated with the given entities.

Relation Relation Classification

A Systematic Review of Deep Learning-based Research on Radiology Report Generation

1 code implementation 23 Nov 2023 Chang Liu, Yuanhe Tian, Yan Song

Specifically, we firstly cover pivotal RRG approaches based on the task-specific features of radiographs, reports, and the cross-modal relations between them, and then illustrate the benchmark datasets conventionally used for this task with evaluation metrics, subsequently analyze the performance of different approaches and finally offer our summary on the challenges and the trends in future directions.

Named Entity Recognition for Social Media Texts with Semantic Augmentation

1 code implementation EMNLP 2020 Yuyang Nie, Yuanhe Tian, Xiang Wan, Yan Song, Bo Dai

In particular, we obtain the augmented semantic information from a large-scale corpus, and propose an attentive semantic augmentation module and a gate module to encode and aggregate such information, respectively.

Chinese Named Entity Recognition named-entity-recognition +3

Iterative Document Representation Learning Towards Summarization with Polishing

1 code implementation EMNLP 2018 Xiuying Chen, Shen Gao, Chongyang Tao, Yan Song, Dongyan Zhao, Rui Yan

In this paper, we introduce Iterative Text Summarization (ITS), an iteration-based model for supervised extractive text summarization, inspired by the observation that it is often necessary for a human to read an article multiple times in order to fully understand and summarize its contents.

Extractive Text Summarization Representation Learning +1

Aspect-based Sentiment Analysis with Type-aware Graph Convolutional Networks and Layer Ensemble

2 code implementations NAACL 2021 Yuanhe Tian, Guimin Chen, Yan Song

It is popular that neural graph-based models are applied in existing aspect-based sentiment analysis (ABSA) studies for utilizing word relations through dependency parses to facilitate the task with better semantic guidance for analyzing context and aspect words.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA)

Summarizing Medical Conversations via Identifying Important Utterances

1 code implementation COLING 2020 Yan Song, Yuanhe Tian, Nan Wang, Fei Xia

For the particular dataset used in this study, we show that high-quality summaries can be generated by extracting two types of utterances, namely, problem statements and treatment recommendations.

ZEN 2.0: Continue Training and Adaption for N-gram Enhanced Text Encoders

1 code implementation 4 May 2021 Yan Song, Tong Zhang, Yonggang Wang, Kai-Fu Lee

Pre-trained text encoders have drawn sustaining attention in natural language processing (NLP) and shown their capability in obtaining promising results in different tasks.

Joint Aspect Extraction and Sentiment Analysis with Directional Graph Convolutional Networks

1 code implementation COLING 2020 Guimin Chen, Yuanhe Tian, Yan Song

End-to-end aspect-based sentiment analysis (EASA) consists of two sub-tasks: the first extracts the aspect terms in a sentence and the second predicts the sentiment polarities for such terms.

Aspect-Based Sentiment Analysis Aspect Extraction +1

Reinforced Training Data Selection for Domain Adaptation

1 code implementation ACL 2019 Miaofeng Liu, Yan Song, Hongbin Zou, Tong Zhang

Supervised models suffer from the problem of domain shifting, where distribution mismatch in the data across domains greatly affects model performance.

Dependency Parsing Domain Adaptation +3

What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues

1 code implementation IJCNLP 2019 Xintong Yu, Hongming Zhang, Yangqiu Song, Yan Song, Chang-Shui Zhang

To tackle this challenge, in this paper, we formally define the task of visual-aware pronoun coreference resolution (PCR) and introduce VisPro, a large-scale dialogue PCR dataset, to investigate whether and how the visual information can help resolve pronouns in dialogues.

coreference-resolution Natural Language Understanding

Keyphrase Generation with Cross-Document Attention

1 code implementation 21 Apr 2020 Shizhe Diao, Yan Song, Tong Zhang

Keyphrase generation aims to produce a set of phrases summarizing the essentials of a given document.

Keyphrase Generation

ChiMed: A Chinese Medical Corpus for Question Answering

1 code implementation WS 2019 Yuanhe Tian, Weicheng Ma, Fei Xia, Yan Song

Question answering (QA) is a challenging task in natural language processing (NLP), especially when it is applied to specific domains.

Question Answering

Supertagging Combinatory Categorial Grammar with Attentive Graph Convolutional Networks

1 code implementation EMNLP 2020 Yuanhe Tian, Yan Song, Fei Xia

Specifically, we build the graph from chunks (n-grams) extracted from a lexicon and apply attention over the graph, so that different word pairs from the contexts within and across chunks are weighted in the model and facilitate the supertagging accordingly.

CCG Supertagging

Knowledge-aware Pronoun Coreference Resolution

1 code implementation ACL 2019 Hongming Zhang, Yan Song, Yangqiu Song, Dong Yu

Resolving pronoun coreference requires knowledge support, especially for particular domains (e.g., medicine).

coreference-resolution Knowledge Graphs

Incorporating Context and External Knowledge for Pronoun Coreference Resolution

1 code implementation NAACL 2019 Hongming Zhang, Yan Song, Yangqiu Song

Linking pronominal expressions to the correct references requires, in many cases, better analysis of the contextual information and external knowledge.

coreference-resolution

Word Graph Guided Summarization for Radiology Findings

1 code implementation Findings (ACL) 2021 Jinpeng Hu, Jianling Li, Zhihong Chen, Yaling Shen, Yan Song, Xiang Wan, Tsung-Hui Chang

In this paper, we propose a novel method for automatic impression generation, where a word graph is constructed from the findings to record the critical words and their relations, then a Word Graph guided Summarization model (WGSum) is designed to generate impressions with the help of the word graph.

Text Summarization

Improving Constituency Parsing with Span Attention

1 code implementation Findings of the Association for Computational Linguistics 2020 Yuanhe Tian, Yan Song, Fei Xia, Tong Zhang

Constituency parsing is a fundamental and important task for natural language understanding, where a good representation of contextual information can help this task.

Constituency Parsing Natural Language Understanding +1

RevCore: Review-augmented Conversational Recommendation

1 code implementation Findings (ACL) 2021 Yu Lu, Junwei Bao, Yan Song, Zichen Ma, Shuguang Cui, Youzheng Wu, Xiaodong He

Existing conversational recommendation (CR) systems usually suffer from insufficient item information when conducted on short dialogue history and unfamiliar items.

Response Generation

Joint Chinese Word Segmentation and Part-of-speech Tagging via Multi-channel Attention of Character N-grams

1 code implementation COLING 2020 Yuanhe Tian, Yan Song, Fei Xia

However, their work on modeling such contextual features is limited to concatenating the features or their embeddings directly with the input embeddings without distinguishing whether the contextual features are important for the joint task in the specific context.

Chinese Word Segmentation Part-Of-Speech Tagging +2

Improving Biomedical Named Entity Recognition with Syntactic Information

1 code implementation BMC Bioinformatics 2020 Yuanhe Tian, Wang Shen, Yan Song, Fei Xia, Min He, Kenli Li

The experimental results on six English benchmark datasets demonstrate that auto-processed syntactic information can be a useful resource for BioNER and our method with KVMN can appropriately leverage such information to improve model performance.

named-entity-recognition Named Entity Recognition +2

WTMED at MEDIQA 2019: A Hybrid Approach to Biomedical Natural Language Inference

1 code implementation WS 2019 Zhaofeng Wu, Yan Song, Sicong Huang, Yuanhe Tian, Fei Xia

Natural language inference (NLI) is challenging, especially when it is applied to technical domains such as biomedical settings.

Natural Language Inference

Hashtag-Guided Low-Resource Tweet Classification

1 code implementation 20 Feb 2023 Shizhe Diao, Sedrick Scott Keh, Liangming Pan, Zhiliang Tian, Yan Song, Tong Zhang

Social media classification tasks (e.g., tweet sentiment analysis, tweet stance detection) are challenging because social media posts are typically short, informal, and ambiguous.

Classification Sentiment Analysis +1

Exploring Unknown States with Action Balance

2 code implementations 10 Mar 2020 Yan Song, Yingfeng Chen, Yujing Hu, Changjie Fan

In this paper, we focus on improving the effectiveness of finding unknown states and propose action balance exploration, which balances the frequency of selecting each action at a given state and can be treated as an extension of upper confidence bound (UCB) to deep reinforcement learning.

Montezuma's Revenge reinforcement-learning +1

Improving Relation Extraction through Syntax-induced Pre-training with Dependency Masking

1 code implementation Findings (ACL) 2022 Yuanhe Tian, Yan Song, Fei Xia

Relation extraction (RE) is an important natural language processing task that predicts the relation between two given entities, where a good understanding of the contextual information is essential to achieve an outstanding model performance.

Relation Relation Extraction +1

Enhancing Relation Extraction via Adversarial Multi-task Learning

1 code implementation LREC 2022 Han Qin, Yuanhe Tian, Yan Song

Relation extraction (RE) is a sub-field of information extraction, which aims to extract the relation between two given named entities (NEs) in a sentence and thus requires a good understanding of contextual information, especially the entities and their surrounding texts.

Multi-Task Learning named-entity-recognition +5

hyperdoc2vec: Distributed Representations of Hypertext Documents

1 code implementation ACL 2018 Jialong Han, Yan Song, Wayne Xin Zhao, Shuming Shi, Haisong Zhang

Hypertext documents, such as web pages and academic papers, are of great importance in delivering information in our daily life.

Citation Recommendation Document Embedding +1

Learning Word Embeddings with Domain Awareness

1 code implementation 7 Jun 2019 Guoyin Wang, Yan Song, Yue Zhang, Dong Yu

Word embeddings are traditionally trained on a large corpus in an unsupervised setting, with no specific design for incorporating domain knowledge.

Learning Word Embeddings

Multiplex Word Embeddings for Selectional Preference Acquisition

1 code implementation IJCNLP 2019 Hongming Zhang, Jiaxin Bai, Yan Song, Kun Xu, Changlong Yu, Yangqiu Song, Wilfred Ng, Dong Yu

Therefore, in this paper, we propose a multiplex word embedding model, which can be easily extended according to various relations among words.

Word Embeddings Word Similarity

Dilation-Erosion for Single-Frame Supervised Temporal Action Localization

1 code implementation 13 Dec 2022 Bin Wang, Yan Song, Fanming Wang, Yang Zhao, Xiangbo Shu, Yan Rui

To balance the annotation labor and the granularity of supervision, single-frame annotation has been introduced in temporal action localization.

Temporal Action Localization

Complementary Learning of Aspect Terms for Aspect-based Sentiment Analysis

1 code implementation LREC 2022 Han Qin, Yuanhe Tian, Fei Xia, Yan Song

Aspect-based sentiment analysis (ABSA) aims to predict the sentiment polarity towards a given aspect term in a sentence at a fine-grained level, which usually requires a good understanding of contextual information, especially an appropriate distinction between a given aspect and its contexts, to achieve good performance.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Syntax-driven Approach for Semantic Role Labeling

1 code implementation LREC 2022 Yuanhe Tian, Han Qin, Fei Xia, Yan Song

To achieve a better performance in SRL, a model is always required to have a good understanding of the context information.

POS Semantic Role Labeling +1

Enhancing Structure-aware Encoder with Extremely Limited Data for Graph-based Dependency Parsing

1 code implementation COLING 2022 Yuanhe Tian, Yan Song, Fei Xia

Dependency parsing is an important fundamental natural language processing task which analyzes the syntactic structure of an input sentence by illustrating the syntactic relations between words.

2k Dependency Parsing +1

A Manually Annotated Chinese Corpus for Non-task-oriented Dialogue Systems

no code implementations 15 May 2018 Jing Li, Yan Song, Haisong Zhang, Shuming Shi

This paper presents a large-scale corpus for non-task-oriented dialogue response selection, which contains over 27K distinct prompts and more than 82K responses collected from social media.

Informativeness Task-Oriented Dialogue Systems

Concurrence-Aware Long Short-Term Sub-Memories for Person-Person Action Recognition

no code implementations 3 Jun 2017 Xiangbo Shu, Jinhui Tang, Guo-Jun Qi, Yan Song, Zechao Li, Liyan Zhang

To this end, we propose a novel Concurrence-Aware Long Short-Term Sub-Memories (Co-LSTSM) to model the long-term inter-related dynamics between two interacting people on the bounding boxes covering people.

Action Recognition Temporal Action Localization

A Joint Model of Conversational Discourse and Latent Topics on Microblogs

no code implementations 11 Sep 2018 Jing Li, Yan Song, Zhongyu Wei, Kam-Fai Wong

To address this issue, we organize microblog messages as conversation trees based on their reposting and replying relations, and propose an unsupervised model that jointly learns word distributions to represent: 1) different roles of conversational discourse, 2) various latent topics in reflecting content information.

Topic Models

Learning User Embeddings from Emails

no code implementations EACL 2017 Yan Song, Chia-Jung Lee

Many important email-related tasks, such as email classification or search, highly rely on building quality document representations (e.g., bag-of-words or key phrases) to assist matching and understanding.

General Classification Recommendation Systems +1

Encoding Conversation Context for Neural Keyphrase Extraction from Microblog Posts

no code implementations NAACL 2018 Yingyi Zhang, Jing Li, Yan Song, Chengzhi Zhang

Existing keyphrase extraction methods suffer from the data sparsity problem when they are conducted on short and informal texts, especially microblog messages.

Information Retrieval Keyphrase Extraction +1

Directional Skip-Gram: Explicitly Distinguishing Left and Right Context for Word Embeddings

no code implementations NAACL 2018 Yan Song, Shuming Shi, Jing Li, Haisong Zhang

In this paper, we present directional skip-gram (DSG), a simple but effective enhancement of the skip-gram model by explicitly distinguishing left and right context in word prediction.

Learning Word Embeddings Part-Of-Speech Tagging +1

Learning Word Representations with Regularization from Prior Knowledge

no code implementations CONLL 2017 Yan Song, Chia-Jung Lee, Fei Xia

This paper presents a unified framework that leverages pre-learned or external priors, in the form of a regularizer, for enhancing conventional language model-based embedding learning.

Language Modelling Learning Word Embeddings +3

Domain Adaptation for Disease Phrase Matching with Adversarial Networks

no code implementations WS 2018 Miaofeng Liu, Jialong Han, Haisong Zhang, Yan Song

With the development of medical information management, numerous medical data are being classified, indexed, and searched in various systems.

Domain Adaptation Entity Linking +2

A Joint Model of Conversational Discourse and Latent Topics on Microblogs

no code implementations CL 2018 Jing Li, Yan Song, Zhongyu Wei, Kam-Fai Wong

To address this issue, we organize microblog messages as conversation trees based on their reposting and replying relations, and propose an unsupervised model that jointly learns word distributions to represent: (1) different roles of conversational discourse, and (2) various latent topics in reflecting content information.

Topic Models

Coordinated Reasoning for Cross-Lingual Knowledge Graph Alignment

no code implementations 23 Jan 2020 Kun Xu, Linfeng Song, Yansong Feng, Yan Song, Dong Yu

Existing entity alignment methods mainly vary on the choices of encoding the knowledge graph, but they typically use the same decoding method, which independently chooses the local optimal match for each source entity.

Entity Alignment

Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation

no code implementations ACL 2020 Kun Li, Chengbo Chen, Xiaojun Quan, Qing Ling, Yan Song

In this paper, we formulate the data augmentation as a conditional generation task: generating a new sentence while preserving the original opinion targets and labels.

Data Augmentation Extract Aspect +3

XLST: Cross-lingual Self-training to Learn Multilingual Representation for Low Resource Speech Recognition

no code implementations 15 Mar 2021 Zi-Qiang Zhang, Yan Song, Ming-Hui Wu, Xin Fang, Li-Rong Dai

In this paper, we propose a weakly supervised multilingual representation learning framework, called cross-lingual self-training (XLST).

Data Augmentation Representation Learning +2

From Target Tracking to Targeting Track: A Data-Driven Yet Analytical Approach to Joint Target Detection and Tracking

no code implementations 20 Apr 2021 Tiancheng Li, Yan Song, Hongqi Fan

This paper addresses the problem of real-time detection and tracking of a non-cooperative target in the challenging scenario with almost no a-priori information about target birth, death, dynamics and detection probability.

Understanding the Spread of COVID-19 Epidemic: A Spatio-Temporal Point Process View

no code implementations 24 Jun 2021 Shuang Li, Lu Wang, Xinyun Chen, Yixiang Fang, Yan Song

In this paper, we model the propagation of COVID-19 as spatio-temporal point processes and propose a generative and intensity-free model to track the spread of the disease.

Imitation Learning Point Processes

Learning to Cluster via Same-Cluster Queries

no code implementations 17 Aug 2021 Yi Li, Yan Song, Qin Zhang

We study the problem of learning to cluster data points using an oracle which can answer same-cluster queries.

Meet Changes with Constancy: Learning Invariance in Multi-Source Translation

no code implementations COLING 2020 Jianfeng Liu, Ling Luo, Xiang Ao, Yan Song, Haoran Xu, Jian Ye

Multi-source neural machine translation aims to translate from parallel sources of information (e.g., languages, images, etc.)

Machine Translation NMT +1

Relation Extraction with Word Graphs from N-grams

no code implementations EMNLP 2021 Han Qin, Yuanhe Tian, Yan Song

Most recent studies for relation extraction (RE) leverage the dependency tree of the input sentence to incorporate syntax-driven contextual information to improve model performance, with little attention paid to the limitation that high-quality dependency parsers are in most cases unavailable, especially for in-domain scenarios.

Relation Relation Extraction +1

Improving Federated Learning for Aspect-based Sentiment Analysis via Topic Memories

no code implementations EMNLP 2021 Han Qin, Guimin Chen, Yuanhe Tian, Yan Song

Aspect-based sentiment analysis (ABSA) predicts the sentiment polarity towards a particular aspect term in a sentence, which is an important task in real-world applications.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +2

Exploring Word Segmentation and Medical Concept Recognition for Chinese Medical Texts

1 code implementation NAACL (BioNLP) 2021 Yang Liu, Yuanhe Tian, Tsung-Hui Chang, Song Wu, Xiang Wan, Yan Song

Chinese word segmentation (CWS) and medical concept recognition are two fundamental tasks to process Chinese electronic medical records (EMRs) and play important roles in downstream tasks for understanding Chinese EMRs.

Chinese Word Segmentation Model Selection +1

Domain Adaptation via Distribution and Representation Matching: A Case Study on Training Data Selection via Reinforcement Learning

no code implementations 27 Sep 2018 Miaofeng Liu, Yan Song, Hongbin Zou, Tong Zhang

Following the TDS methodology, in this paper, we propose a general data selection framework with representation learning and distribution matching simultaneously for domain adaptation on neural models.

Dependency Parsing Domain Adaptation +4

Expansion-Squeeze-Excitation Fusion Network for Elderly Activity Recognition

no code implementations 21 Dec 2021 Xiangbo Shu, Jiawen Yang, Rui Yan, Yan Song

This work focuses on the task of elderly activity recognition, which is a challenging task due to the existence of individual actions and human-object interactions in elderly activities.

Action Recognition Human-Object Interaction Detection

Reinforced Cross-modal Alignment for Radiology Report Generation

1 code implementation Findings (ACL) 2022 Han Qin, Yan Song

In detail, a shared memory is used to record the mappings between visual and textual information, and the proposed reinforced algorithm is performed to learn the signal from the reports to guide the cross-modal alignment even though such reports are not directly related to how images and texts are mapped.

Decision Making Reinforcement Learning (RL) +1

Turning to a Teacher for Timestamp Supervised Temporal Action Segmentation

no code implementations 2 Jul 2022 Yang Zhao, Yan Song

To obtain more information to optimize the model, the existing method generated pseudo frame-wise labels iteratively based on the output of a segmentation model and the timestamp annotations.

Action Segmentation Model Optimization +1

Chinese Couplet Generation with Syntactic Information

no code implementations COLING 2022 Yan Song

Chinese couplet generation aims to generate a pair of clauses (usually generating a subsequent clause given an antecedent one) with certain rules (e.g., morphological and syntactical symmetry) adhered and has long been a challenging task with cultural background.

POS

Neural Episodic Control with State Abstraction

no code implementations 27 Jan 2023 Zhuo Li, Derui Zhu, Yujing Hu, Xiaofei Xie, Lei Ma, Yan Zheng, Yan Song, Yingfeng Chen, Jianjun Zhao

Generally, episodic control-based approaches are solutions that leverage highly-rewarded past experiences to improve sample efficiency of DRL algorithms.

OpenAI Gym

Unsupervised Neural Aspect Extraction with Sememes

no code implementations IJCAI 2019 Ling Luo, Xiang Ao, Yan Song, Jinyao Li, Xiaopeng Yang, Qing He, Dong Yu

Aspect extraction relies on identifying aspects by discovering coherence among words, which is challenging when word meanings are diversified and processing on short texts.

Aspect Extraction Aspect Term Extraction and Sentiment Classification +1

AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer

no code implementations 7 Mar 2023 Kang Li, Yan Song, Li-Rong Dai, Ian McLoughlin, Xin Fang, Lin Liu

In this paper, we propose an effective sound event detection (SED) method based on the audio spectrogram transformer (AST) model, pretrained on the large-scale AudioSet for audio tagging (AT) task, termed AST-SED.

Audio Tagging Event Detection +1

Joint Generative-Contrastive Representation Learning for Anomalous Sound Detection

no code implementations 20 May 2023 Xiao-Min Zeng, Yan Song, Zhu Zhuo, Yu Zhou, Yu-Hong Li, Hui Xue, Li-Rong Dai, Ian McLoughlin

In this paper, we propose a joint generative and contrastive representation learning method (GeCo) for anomalous sound detection (ASD).

Contrastive Learning Representation Learning

Ziya-Visual: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning

no code implementations 12 Oct 2023 Junyu Lu, Dixiang Zhang, XiaoJun Wu, Xinyu Gao, Ruyi Gan, Jiaxing Zhang, Yan Song, Pingjian Zhang

Recent advancements enlarge the capabilities of large language models (LLMs) in zero-shot image-to-text generation and understanding by integrating multi-modal inputs.

Image Captioning In-Context Learning +5

Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models

no code implementations 27 Oct 2023 Xue Yan, Yan Song, Xinyu Cui, Filippos Christianos, Haifeng Zhang, David Henry Mguni, Jun Wang

To that purpose, we offer a new leader-follower bilevel framework that is capable of learning to ask relevant questions (prompts) and subsequently undertaking reasoning to guide the learning of actions.

Decision Making

Ziya2: Data-centric Learning is All LLMs Need

no code implementations 6 Nov 2023 Ruyi Gan, Ziwei Wu, Renliang Sun, Junyu Lu, XiaoJun Wu, Dixiang Zhang, Kunhao Pan, Junqing He, Yuanhe Tian, Ping Yang, Qi Yang, Hao Wang, Jiaxing Zhang, Yan Song

Although many such issues are addressed along the line of research on LLMs, an important yet practical limitation is that many studies overly pursue enlarging model sizes without comprehensively analyzing and optimizing the use of pre-training data in their learning process, as well as appropriate organization and leveraging of such data in training LLMs under cost-effective settings.

Improving Image Captioning via Predicting Structured Concepts

no code implementations 14 Nov 2023 Ting Wang, Weidong Chen, Yuanhe Tian, Yan Song, Zhendong Mao

Having the difficulty of solving the semantic gap between images and texts for the image captioning task, conventional studies in this area paid some attention to treating semantic concepts as a bridge between the two modalities and improved captioning performance accordingly.

Image Captioning

iDesigner: A High-Resolution and Complex-Prompt Following Text-to-Image Diffusion Model for Interior Design

no code implementations 7 Dec 2023 Ruyi Gan, XiaoJun Wu, Junyu Lu, Yuanhe Tian, Dixiang Zhang, Ziwei Wu, Renliang Sun, Chang Liu, Jiaxing Zhang, Pingjian Zhang, Yan Song

However, there are few specialized models in certain domains, such as interior design, which is attributed to the complex textual descriptions and detailed visual elements inherent in design, alongside the necessity for adaptable resolution.

Image Generation

Taiyi-Diffusion-XL: Advancing Bilingual Text-to-Image Generation with Large Vision-Language Model Support

no code implementations 26 Jan 2024 XiaoJun Wu, Dixiang Zhang, Ruyi Gan, Junyu Lu, Ziwei Wu, Renliang Sun, Jiaxing Zhang, Pingjian Zhang, Yan Song

Recent advancements in text-to-image models have significantly enhanced image generation capabilities, yet a notable gap of open-source models persists in bilingual or Chinese language support.

Language Modelling Text-to-Image Generation

Learning Macroeconomic Policies based on Microfoundations: A Stackelberg Mean Field Game Approach

no code implementations 14 Mar 2024 Qirui Mi, Zhiyu Zhao, Siyu Xia, Yan Song, Jun Wang, Haifeng Zhang

Effective macroeconomic policies play a crucial role in promoting economic growth and social stability.
