Search Results for author: Di Yin

Found 13 papers, 6 papers with code

FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema

1 code implementation · 19 Feb 2024 · Junru Lu, Siyu An, Min Zhang, Yulan He, Di Yin, Xing Sun

In the quest to facilitate the deep intelligence of Large Language Models (LLMs) accessible in final-end user-bot interactions, the art of prompt crafting emerges as a critical yet complex task for the average user.

MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL

1 code implementation · 18 Dec 2023 · Bing Wang, Changyu Ren, Jian Yang, Xinnian Liang, Jiaqi Bai, Linzheng Chai, Zhao Yan, Qian-Wen Zhang, Di Yin, Xing Sun, Zhoujun Li

Our framework comprises a core decomposer agent for Text-to-SQL generation with few-shot chain-of-thought reasoning, accompanied by two auxiliary agents that utilize external tools or models to acquire smaller sub-databases and refine erroneous SQL queries.

SQL Parsing · Text-To-SQL

VKIE: The Application of Key Information Extraction on Video Text

no code implementations · 18 Oct 2023 · Siyu An, Ye Liu, Haoyuan Peng, Di Yin

Extracting structured information from videos is critical for numerous downstream applications in the industry.

Key Information Extraction

MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation

1 code implementation · 16 Aug 2023 · Junru Lu, Siyu An, Mingbao Lin, Gabriele Pergola, Yulan He, Di Yin, Xing Sun, Yunsheng Wu

We propose MemoChat, a pipeline for refining instructions that enables large language models (LLMs) to effectively employ self-composed memos for maintaining consistent long-range open-domain conversations.

Memorization · Retrieval

OSAN: A One-Stage Alignment Network To Unify Multimodal Alignment and Unsupervised Domain Adaptation

no code implementations · CVPR 2023 · Ye Liu, Lingfeng Qiao, Changchong Lu, Di Yin, Chen Lin, Haoyuan Peng, Bo Ren

An intuitive way to handle these two problems is to fulfill these tasks in two separate stages: aligning modalities followed by domain adaptation, or vice versa.

Unsupervised Domain Adaptation

Grafting Pre-trained Models for Multimodal Headline Generation

no code implementations · 14 Nov 2022 · Lingfeng Qiao, Chen Wu, Ye Liu, Haoyuan Peng, Di Yin, Bo Ren

In this paper, we propose a novel approach to graft the video encoder from the pre-trained video-language model on the generative pre-trained language model.

Headline Generation · Language Modelling · +1

Unsupervised Extractive Summarization with Heterogeneous Graph Embeddings for Chinese Document

no code implementations · 9 Nov 2022 · Chen Lin, Ye Liu, Siyu An, Di Yin

In the scenario of unsupervised extractive summarization, learning high-quality sentence representations is essential to select salient sentences from the input document.

Extractive Summarization · Sentence · +2

Leveraging Key Information Modeling to Improve Less-Data Constrained News Headline Generation via Duality Fine-Tuning

no code implementations · 10 Oct 2022 · Zhuoxuan Jiang, Lingfeng Qiao, Di Yin, Shanshan Feng, Bo Ren

Recent language generative models are mostly trained on large-scale datasets, while in some real scenarios, the training datasets are often expensive to obtain and would be small-scale.

Headline Generation · Informativeness · +1

OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification

no code implementations · 4 Jul 2022 · Ye Liu, Lingfeng Qiao, Di Yin, Zhuoxuan Jiang, Xinghua Jiang, Deqiang Jiang, Bo Ren

In this paper, from an alternate perspective to overcome the above challenges, we unite these two tasks into one task by a new form of predicting shots link: a link connects two adjacent shots, indicating that they belong to the same scene or category.

Scene Segmentation

RAAT: Relation-Augmented Attention Transformer for Relation Modeling in Document-Level Event Extraction

1 code implementation · NAACL 2022 · Yuan Liang, Zhuoxuan Jiang, Di Yin, Bo Ren

To further leverage relation information, we introduce a separate event relation prediction task and adopt multi-task learning method to explicitly enhance event extraction performance.

Document-level Event Extraction · Event Extraction · +3

Contrastive Graph Multimodal Model for Text Classification in Videos

no code implementations · 6 Jun 2022 · Ye Liu, Changchong Lu, Chen Lin, Di Yin, Bo Ren

However, to our knowledge, there is no existing work focused on the second step of video text classification, which will limit the guidance to downstream tasks such as video indexing and browsing.

Contrastive Learning · Optical Character Recognition (OCR) · +2
