Search Results for author: Mengyue Wu

Found 26 papers, 10 papers with code

A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models

no code implementations • 28 Feb 2024 • Xiujie Song, Mengyue Wu, Kenny Q. Zhu, Chunhao Zhang, Yanyi Chen

Large Vision Language Models (LVLMs), despite their recent success, have rarely been comprehensively tested for their cognitive abilities.

Question Answering, Visual Question Answering

Phonetic and Lexical Discovery of a Canine Language using HuBERT

no code implementations • 25 Feb 2024 • Xingyuan Li, Sinong Wang, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu

This paper explores potential communication patterns within dog vocalizations and moves beyond traditional linguistic analysis, which relies heavily on human a priori knowledge and limited datasets to find sound units in dog vocalizations.

PsyEval: A Comprehensive Large Language Model Evaluation Benchmark for Mental Health

no code implementations • 15 Nov 2023 • Haoan Jin, Siyuan Chen, Mengyue Wu, Kenny Q. Zhu

Recently, there has been a growing interest in utilizing large language models (LLMs) in mental health research, with studies showcasing their remarkable capabilities, such as disease detection.

Language Modelling, Large Language Model, +1

Does My Dog "Speak" Like Me? The Acoustic Correlation between Pet Dogs and Their Human Owners

no code implementations • 21 Sep 2023 • Jieyi Huang, Chunhao Zhang, YuFei Wang, Mengyue Wu, Kenny Zhu

How owners' language influences their pets' vocalizations is an interesting yet underexplored problem.

Towards Lexical Analysis of Dog Vocalizations via Online Videos

no code implementations • 21 Sep 2023 • YuFei Wang, Chunhao Zhang, Jieyi Huang, Mengyue Wu, Kenny Zhu

This study presents a data-driven investigation into the semantics of dog vocalizations via correlating different sound types with consistent semantics.

Lexical Analysis

A Large-scale Dataset for Audio-Language Representation Learning

no code implementations • 20 Sep 2023 • Luoyi Sun, Xuenan Xu, Mengyue Wu, Weidi Xie

To tackle these challenges, we present an innovative and automatic audio caption generation pipeline based on a series of public tools or APIs, and construct a large-scale, high-quality audio-language dataset, named Auto-ACD, comprising over 1.9M audio-text pairs.

Audio captioning, Representation Learning, +1

Improving Audio Caption Fluency with Automatic Error Correction

no code implementations • 16 Jun 2023 • Hanxue Zhang, Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu

Automated audio captioning (AAC) is an important cross-modality translation task, aiming at generating descriptions for audio clips.

Audio captioning, Sentence

LLM-empowered Chatbots for Psychiatrist and Patient Simulation: Application and Evaluation

no code implementations • 23 May 2023 • Siyuan Chen, Mengyue Wu, Kenny Q. Zhu, Kunyao Lan, Zhiling Zhang, Lyuchun Cui

Empowering chatbots in the field of mental health is receiving an increasing amount of attention, yet the development and evaluation of chatbots in psychiatric outpatient scenarios remains underexplored.

Chatbot

Semantic Space Grounded Weighted Decoding for Multi-Attribute Controllable Dialogue Generation

1 code implementation • 4 May 2023 • Zhiling Zhang, Mengyue Wu, Kenny Q. Zhu

Controlling chatbot utterance generation with multiple attributes such as personalities, emotions and dialogue acts is a practically useful but under-studied problem.

Attribute, Dialogue Generation

D4: a Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat

no code implementations • 24 May 2022 • Binwei Yao, Chao Shi, Likai Zou, Lingfeng Dai, Mengyue Wu, Lu Chen, Zhen Wang, Kai Yu

In a depression-diagnosis-directed clinical session, doctors initiate a conversation with ample emotional support that guides the patients to expose their symptoms based on clinical diagnosis criteria.

Response Generation

Symptom Identification for Interpretable Detection of Multiple Mental Disorders

no code implementations • 23 May 2022 • Zhiling Zhang, Siyuan Chen, Mengyue Wu, Kenny Q. Zhu

Mental disease detection (MDD) from social media has suffered from poor generalizability and interpretability due to a lack of symptom modeling.

Psychiatric Scale Guided Risky Post Screening for Early Detection of Depression

1 code implementation • 19 May 2022 • Zhiling Zhang, Siyuan Chen, Mengyue Wu, Kenny Q. Zhu

Depression is a prominent health challenge to the world, and early risk detection (ERD) of depression from online posts can be a promising technique for combating the threat.

Depression Detection

Climate and Weather: Inspecting Depression Detection via Emotion Recognition

no code implementations • 29 Apr 2022 • Wen Wu, Mengyue Wu, Kai Yu

Automatic depression detection has attracted an increasing amount of attention but remains a challenging task.

Depression Detection, Emotion Recognition

Audio-text Retrieval in Context

no code implementations • 25 Mar 2022 • Siyu Lou, Xuenan Xu, Mengyue Wu, Kai Yu

Using pre-trained audio features and a descriptor-based aggregation method, we build our contextual audio-text retrieval system.

AudioCaps, Retrieval, +1

THE SJTU SYSTEM FOR DCASE2021 CHALLENGE TASK 6: AUDIO CAPTIONING BASED ON ENCODER PRE-TRAINING AND REINFORCEMENT LEARNING

1 code implementation • DCASE Challenge 2021 • Xuenan Xu, Zeyu Xie, Mengyue Wu, Kai Yu

This report proposes an audio captioning system for Task 6 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 challenge.

Ranked #2 on Audio captioning on Clotho (using extra training data)

Audio captioning, Audio Tagging, +2

Towards duration robust weakly supervised sound event detection

1 code implementation • 19 Jan 2021 • Heinrich Dinkel, Mengyue Wu, Kai Yu

Our model outperforms other approaches on the DCASE2018 and URBAN-SED datasets without requiring prior duration knowledge.

Data Augmentation, Sound Event Detection, Sound, Audio and Speech Processing

Multiple Sound Sources Localization from Coarse to Fine

1 code implementation • ECCV 2020 • Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu, Weiyao Lin

How to visually localize multiple sound sources in unconstrained videos is a formidable problem, especially when pairwise sound-object annotations are lacking.

Building Interpretable Interaction Trees for Deep NLP Models

no code implementations • 29 Jun 2020 • Die Zhang, Huilin Zhou, Hao Zhang, Xiaoyi Bao, Da Huo, Ruizhao Chen, Xu Cheng, Mengyue Wu, Quanshi Zhang

This paper proposes a method to disentangle and quantify interactions among words that are encoded inside a DNN for natural language processing.

Sentence

Voice activity detection in the wild via weakly supervised sound event detection

1 code implementation • 27 Mar 2020 • Heinrich Dinkel, Yefei Chen, Mengyue Wu, Kai Yu

We proposed two GPVAD models, one full (GPV-F), trained on 527 Audioset sound events, and one binary (GPV-B), only distinguishing speech and noise.

Sound, Audio and Speech Processing

Audio Caption in a Car Setting with a Sentence-Level Loss

1 code implementation • 31 May 2019 • Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu

Captioning has attracted much attention in image and video understanding, while only a small amount of work examines audio captioning.

Audio captioning, Semantic Similarity, +5

Text-based depression detection on sparse data

1 code implementation • 8 Apr 2019 • Heinrich Dinkel, Mengyue Wu, Kai Yu

Previous text-based depression detection is commonly based on large user-generated data.

Depression Detection, Sentence, +1

Audio Caption: Listen and Tell

1 code implementation • 25 Feb 2019 • Mengyue Wu, Heinrich Dinkel, Kai Yu

A baseline encoder-decoder model is provided for both English and Mandarin.

General Classification
