Search Results for author: Wenhao Huang

Found 40 papers, 21 papers with code

DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion

no code implementations17 Jul 2024 Huiguo He, Huan Yang, Zixi Tuo, Yuan Zhou, Qiuyue Wang, Yuhang Zhang, Zeyu Liu, Wenhao Huang, Hongyang Chao, Jian Yin

DreamStory consists of (1) an LLM acting as a story director and (2) an innovative Multi-Subject consistent Diffusion model (MSD) for generating consistent multi-subject across the images.

LongIns: A Challenging Long-context Instruction-based Exam for LLMs

no code implementations25 Jun 2024 Shawn Gavin, Tuney Zheng, Jiaheng Liu, Quehry Que, Noah Wang, Jian Yang, Chenchen Zhang, Wenhao Huang, Wenhu Chen, Ge Zhang

To address these issues, we propose the LongIns benchmark dataset, a challenging long-context instruction-based exam for LLMs, which is built based on the existing instruction datasets.

16k 4k

Intensity Confusion Matters: An Intensity-Distance Guided Loss for Bronchus Segmentation

no code implementations23 Jun 2024 Haifan Gong, Wenhao Huang, huan zhang, Yu Wang, Xiang Wan, Hong Shen, Guanbin Li, Haofeng Li

We regard a voxel as a hard sample if it is in: (1) the background and has an intensity value close to the bronchus region; (2) the bronchus region and is of higher intensity than most voxels inside the bronchus; (3) the background region and at a short distance from the bronchus.


GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models

1 code implementation21 Jun 2024 Leyan Wang, Yonggang Jin, Tianhao Shen, Tianyu Zheng, Xinrun Du, Chenchen Zhang, Wenhao Huang, Jiaheng Liu, Shi Wang, Ge Zhang, Liuyu Xiang, Zhaofeng He

As large language models (LLMs) continue to develop and gain widespread application, the ability of LLMs to exhibit empathy towards diverse group identities and understand their perspectives is increasingly recognized as critical.

PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents

no code implementations20 Jun 2024 Junjie Wang, Yin Zhang, Yatai Ji, Yuxiang Zhang, Chunyang Jiang, YuBo Wang, Kang Zhu, Zekun Wang, Tiezhen Wang, Wenhao Huang, Jie Fu, Bei Chen, Qunshu Lin, Minghao Liu, Ge Zhang, Wenhu Chen

Recent advancements in Large Multimodal Models (LMMs) have leveraged extensive multimodal datasets to enhance capabilities in complex knowledge-driven tasks.

MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language

no code implementations19 Jun 2024 Shun Wang, Ge Zhang, Han Wu, Tyler Loakman, Wenhao Huang, Chenghua Lin

Machine Translation (MT) has developed rapidly since the release of Large Language Models and current MT evaluation is performed through comparison with reference human translations or by predicting quality scores from human-labeled data.

Machine Translation Translation

AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation

1 code implementation19 Apr 2024 Wenhao Huang, Chenghao Peng, Zhixu Li, Jiaqing Liang, Yanghua Xiao, Liqian Wen, Zulong Chen

We propose AutoCrawler, a two-stage framework that leverages the hierarchical structure of HTML for progressive understanding.

Action Generation

Yi: Open Foundation Models by 01.AI

1 code implementation7 Mar 2024 01. AI, :, Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, Jing Chang, Kaidong Yu, Peng Liu, Qiang Liu, Shawn Yue, Senbin Yang, Shiming Yang, Tao Yu, Wen Xie, Wenhao Huang, Xiaohui Hu, Xiaoyi Ren, Xinyao Niu, Pengcheng Nie, Yuchi Xu, Yudong Liu, Yue Wang, Yuxuan Cai, Zhenyu Gu, Zhiyuan Liu, Zonghong Dai

The Yi model family is based on 6B and 34B pretrained language models, then we extend them to chat models, 200K long context models, depth-upscaled models, and vision-language models.

Attribute Chatbot +2

m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers

1 code implementation26 Feb 2024 Ka Man Lo, Yiming Liang, Wenyu Du, Yuantao Fan, Zili Wang, Wenhao Huang, Lei Ma, Jie Fu

Additionally, the V-MoE-Base model trained with m2mKD achieves 3. 5% higher accuracy than end-to-end training on ImageNet-1k.

Knowledge Distillation

LLM Agents for Psychology: A Study on Gamified Assessments

no code implementations19 Feb 2024 Qisen Yang, Zekun Wang, Honghui Chen, Shenzhi Wang, Yifan Pu, Xin Gao, Wenhao Huang, Shiji Song, Gao Huang

Psychological measurement is essential for mental health, self-understanding, and personal development.

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation

1 code implementation6 Feb 2024 Weiming Ren, Huan Yang, Ge Zhang, Cong Wei, Xinrun Du, Wenhao Huang, Wenhu Chen

To verify the effectiveness of our method, we propose I2V-Bench, a comprehensive evaluation benchmark for I2V generation.

Image to Video Generation

SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval

1 code implementation24 Jan 2024 Siwei Wu, Yizhi Li, Kang Zhu, Ge Zhang, Yiming Liang, Kaijing Ma, Chenghao Xiao, Haoran Zhang, Bohao Yang, Wenhu Chen, Wenhao Huang, Noura Al Moubayed, Jie Fu, Chenghua Lin

We further annotate the image-text pairs with two-level subset-subcategory hierarchy annotations to facilitate a more comprehensive evaluation of the baselines.

Benchmarking Image Captioning +3

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

1 code implementation22 Jan 2024 Ge Zhang, Xinrun Du, Bei Chen, Yiming Liang, Tongxu Luo, Tianyu Zheng, Kang Zhu, Yuyang Cheng, Chunpu Xu, Shuyue Guo, Haoran Zhang, Xingwei Qu, Junjie Wang, Ruibin Yuan, Yizhi Li, Zekun Wang, Yudong Liu, Yu-Hsuan Tsai, Fengji Zhang, Chenghua Lin, Wenhao Huang, Wenhu Chen, Jie Fu

We introduce CMMMU, a new Chinese Massive Multi-discipline Multimodal Understanding benchmark designed to evaluate LMMs on tasks demanding college-level subject knowledge and deliberate reasoning in a Chinese context.

Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation

1 code implementation12 Jan 2024 Tianyu Zheng, Shuyue Guo, Xingwei Qu, Jiawei Guo, Weixu Zhang, Xinrun Du, Qi Jia, Chenghua Lin, Wenhao Huang, Wenhu Chen, Jie Fu, Ge Zhang

In this paper, we introduce Kun, a novel approach for creating high-quality instruction-tuning datasets for large language models (LLMs) without relying on manual annotations.

Instruction Following Translation

TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks

1 code implementation1 Oct 2023 Dongfu Jiang, Yishan Li, Ge Zhang, Wenhao Huang, Bill Yuchen Lin, Wenhu Chen

To quantitatively assess our metric, we evaluate its correlation with human ratings on 5 held-in datasets, 2 held-out datasets and show that TIGERScore can achieve the open-source SoTA correlation with human ratings across these datasets and almost approaches GPT-4 evaluator.

Text Generation

MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response

1 code implementation15 Sep 2023 Zihao Deng, Yinghao Ma, Yudong Liu, Rongchen Guo, Ge Zhang, Wenhu Chen, Wenhao Huang, Emmanouil Benetos

Large Language Models (LLMs) have shown immense potential in multimodal applications, yet the convergence of textual and musical domains remains not well-explored.

Caption Generation Language Modelling +1

MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning

1 code implementation11 Sep 2023 Xiang Yue, Xingwei Qu, Ge Zhang, Yao Fu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

The MAmmoTH models are trained on MathInstruct, our meticulously curated instruction tuning dataset.

Math Mathematical Reasoning

Chinese Open Instruction Generalist: A Preliminary Release

2 code implementations17 Apr 2023 Ge Zhang, Yemin Shi, Ruibo Liu, Ruibin Yuan, Yizhi Li, Siwei Dong, Yu Shu, Zhaoqun Li, Zekun Wang, Chenghua Lin, Wenhao Huang, Jie Fu

Instruction tuning is widely recognized as a key technique for building generalist language models, which has attracted the attention of researchers and the public with the release of InstructGPT~\citep{ouyang2022training} and ChatGPT\footnote{\url{https://chat. openai. com/}}.

PEMP: Leveraging Physics Properties to Enhance Molecular Property Prediction

no code implementations18 Oct 2022 Yuancheng Sun, Yimeng Chen, Weizhi Ma, Wenhao Huang, Kang Liu, ZhiMing Ma, Wei-Ying Ma, Yanyan Lan

In our implementation, we adopt both the state-of-the-art molecule embedding models under the supervised learning paradigm and the pretraining paradigm as the molecule representation module of PEMP, respectively.

Drug Discovery Molecular Property Prediction +2

BronchusNet: Region and Structure Prior Embedded Representation Learning for Bronchus Segmentation and Classification

no code implementations14 May 2022 Wenhao Huang, Haifan Gong, huan zhang, Yu Wang, Haofeng Li, Guanbin Li, Hong Shen

CT-based bronchial tree analysis plays an important role in the computer-aided diagnosis for respiratory diseases, as it could provide structured information for clinicians.

Classification Graph Learning +3

Text Assisted Insight Ranking Using Context-Aware Memory Network

no code implementations13 Nov 2018 Qi Zeng, Liangchen Luo, Wenhao Huang, Yang Tang

Extracting valuable facts or informative summaries from multi-dimensional tables, i. e. insight mining, is an important task in data analysis and business intelligence.

Learning Personalized End-to-End Goal-Oriented Dialog

no code implementations12 Nov 2018 Liangchen Luo, Wenhao Huang, Qi Zeng, Zaiqing Nie, Xu sun

Most existing works on dialog systems only consider conversation content while neglecting the personality of the user the bot is interacting with, which begets several unsolved issues.

Goal-Oriented Dialog

Cannot find the paper you are looking for? You can Submit a new open access paper.