Search Results for author: Shikun Zhang

Found 52 papers, 20 papers with code

Improving Embedding-based Large-scale Retrieval via Label Enhancement

no code implementations Findings (EMNLP) 2021 Peiyang Liu, Xi Wang, Sen Wang, Wei Ye, Xiangyu Xi, Shikun Zhang

Current embedding-based large-scale retrieval models are trained with 0-1 hard labels that indicate whether a query is relevant to a document, ignoring rich information about the degree of relevance.


Label Smoothing for Text Mining

no code implementations COLING 2022 Peiyang Liu, Xiangyu Xi, Wei Ye, Shikun Zhang

This paper presents a novel keyword-based LS method to automatically generate soft labels from hard labels via exploiting the relevance between labels and text instances.

Retrieval text-classification +2

Boosting Model Resilience via Implicit Adversarial Data Augmentation

no code implementations 25 Apr 2024 Xiaoling Zhou, Wei Ye, Zhemg Lee, Rui Xie, Shikun Zhang

This insight leads us to develop a meta-learning-based framework for optimizing classifiers with this novel loss, introducing the effects of augmentation while bypassing the explicit augmentation process.

Data Augmentation Long-tail Learning +1

FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models

2 code implementations 9 Apr 2024 Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Zhengran Zeng, Wei Ye, Jindong Wang, Yue Zhang, Shikun Zhang

The rapid development of large language model (LLM) evaluation methodologies and datasets has led to a profound challenge: integrating state-of-the-art evaluation techniques cost-effectively while ensuring reliability, reproducibility, and efficiency.

Fairness Language Modelling +1

CodeShell Technical Report

no code implementations 23 Mar 2024 Rui Xie, Zhengran Zeng, Zhuohao Yu, Chang Gao, Shikun Zhang, Wei Ye

Through this process, we have curated 100 billion high-quality pre-training data from GitHub.


NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

no code implementations 5 Mar 2024 Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao

Specifically, 1) we design a neural codec with factorized vector quantization (FVQ) to disentangle speech waveform into subspaces of content, prosody, timbre, and acoustic details; 2) we propose a factorized diffusion model to generate attributes in each subspace following its corresponding prompt.

Quantization Speech Synthesis

Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models

no code implementations 24 Feb 2024 Chaoya Jiang, Wei Ye, Mengfan Dong, Hongrui Jia, Haiyang Xu, Ming Yan, Ji Zhang, Shikun Zhang

Large Vision Language Models exhibit remarkable capabilities but struggle with hallucinations (inconsistencies between images and their descriptions).

Hallucination Hallucination Evaluation

KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

2 code implementations 23 Feb 2024 Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Wei Ye, Jindong Wang, Xing Xie, Yue Zhang, Shikun Zhang

Automatic evaluation methods for large language models (LLMs) are hindered by data contamination, leading to inflated assessments of their effectiveness.

Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection

no code implementations 11 Jan 2024 Wei Ye, Chaoya Jiang, Haiyang Xu, Chenhao Ye, Chenliang Li, Ming Yan, Shikun Zhang, Songhang Huang, Fei Huang

Vision Transformers (ViTs) have become increasingly popular in large-scale Vision and Language Pre-training (VLP) models.

When Parameter-efficient Tuning Meets General-purpose Vision-language Models

1 code implementation 16 Dec 2023 Yihang Zhai, Haixin Wang, Jianlong Chang, Xinlong Yang, Jinan Sun, Shikun Zhang, Qi Tian

Instruction tuning has shown promising potential for developing general-purpose AI capabilities by using large-scale pre-trained models, and it has spurred growing research into integrating multimodal information for creative applications.

Labels Need Prompts Too: Mask Matching for Natural Language Understanding Tasks

no code implementations 14 Dec 2023 Bo Li, Wei Ye, Quansen Wang, Wen Zhao, Shikun Zhang

Textual label names (descriptions) are typically semantically rich in many natural language understanding (NLU) tasks.

Natural Language Understanding

TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training

1 code implementation 14 Dec 2023 Chaoya Jiang, Wei Ye, Haiyang Xu, Qinghao Ye, Ming Yan, Ji Zhang, Shikun Zhang

Self-supervised Multi-modal Contrastive Learning (SMCL) remarkably advances modern Vision-Language Pre-training (VLP) models by aligning visual and linguistic modalities.

Contrastive Learning Data Augmentation

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

1 code implementation 12 Dec 2023 Chaoya Jiang, Haiyang Xu, Mengfan Dong, Jiaxing Chen, Wei Ye, Ming Yan, Qinghao Ye, Ji Zhang, Fei Huang, Shikun Zhang

We first analyzed the representation distribution of textual and visual tokens in MLLM, revealing two important findings: 1) there is a significant gap between textual and visual representations, indicating unsatisfactory cross-modal representation alignment; 2) representations of texts that contain and do not contain hallucinations are entangled, making it challenging to distinguish them.

Contrastive Learning Hallucination +4

MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

1 code implementation 18 Oct 2023 Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian

For developers and amateurs, it is very difficult to grasp all of these tasks to satisfy their requirements in music processing, especially considering the huge differences in the representations of music data and the model applicability across platforms among various tasks.

Music Classification

BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization

no code implementations 17 Jul 2023 Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Songfang Huang

Specifically, we incorporate a Text-Semantics-Aware Patch Selector (TSPS) into the ViT backbone to perform a coarse-grained visual token extraction and then attach a flexible Transformer-based Patch Abstraction Decoder (PAD) upon the backbone for top-level visual abstraction.

Decoder Text Summarization

EmoGen: Eliminating Subjective Bias in Emotional Music Generation

1 code implementation 3 Jul 2023 Chenfei Kang, Peiling Lu, Botao Yu, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian

In this paper, we propose EmoGen, an emotional music generation system that leverages a set of emotion-related music attributes as the bridge between emotion and music, and divides the generation into two stages: emotion-to-attribute mapping with supervised clustering, and attribute-to-music generation with self-supervised learning.

Attribute Clustering +2

PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization

2 code implementations 8 Jun 2023 Yidong Wang, Zhuohao Yu, Zhengran Zeng, Linyi Yang, Cunxiang Wang, Hao Chen, Chaoya Jiang, Rui Xie, Jindong Wang, Xing Xie, Wei Ye, Shikun Zhang, Yue Zhang

To ensure the reliability of PandaLM, we collect a diverse human-annotated test dataset, where all contexts are generated by humans and labels are aligned with human preferences.

Language Modelling Large Language Model

GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework

1 code implementation 18 May 2023 Ang Lv, Xu Tan, Peiling Lu, Wei Ye, Shikun Zhang, Jiang Bian, Rui Yan

Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with any arbitrary source-target track combinations.

Denoising Music Generation

3D Registration with Maximal Cliques

1 code implementation CVPR 2023 Xiyu Zhang, Jiaqi Yang, Shikun Zhang, Yanning Zhang

The key insight is to loosen the previous maximum clique constraint, and mine more local consensus information in a graph for accurate pose hypotheses generation: 1) A compatibility graph is constructed to render the affinity relationship between initial correspondences.

Point Cloud Registration

Exploiting Pseudo Image Captions for Multimodal Summarization

no code implementations 9 May 2023 Chaoya Jiang, Rui Xie, Wei Ye, Jinan Sun, Shikun Zhang

Cross-modal contrastive learning in vision language pretraining (VLP) faces the challenge of (partial) false negatives.

Common Sense Reasoning Contrastive Learning +1

Evaluating ChatGPT's Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness

1 code implementation 23 Apr 2023 Bo Li, Gexiang Fang, Yang Yang, Quansen Wang, Wei Ye, Wen Zhao, Shikun Zhang

The capability of Large Language Models (LLMs) like ChatGPT to comprehend user intent and provide reasonable responses has made them extremely popular lately.

Exploring Vision-Language Models for Imbalanced Learning

1 code implementation 4 Apr 2023 Yidong Wang, Zhuohao Yu, Jindong Wang, Qiang Heng, Hao Chen, Wei Ye, Rui Xie, Xing Xie, Shikun Zhang

However, their performance on imbalanced datasets is relatively poor, where the distribution of classes in the training dataset is skewed, leading to poor performance in predicting minority classes.

Decoder Zero-Shot Learning

Hierarchical Prior Mining for Non-local Multi-View Stereo

no code implementations ICCV 2023 Chunlin Ren, Qingshan Xu, Shikun Zhang, Jiaqi Yang

3) A Hierarchical Prior Mining (HPM) framework, which is used to mine extensive non-local prior information at different scales to assist 3D model recovery; this strategy can achieve a considerable balance between the reconstruction of details and low-textured areas.

Prototypical Mixing and Retrieval-Based Refinement for Label Noise-Resistant Image Retrieval

no code implementations ICCV 2023 Xinlong Yang, Haixin Wang, Jinan Sun, Shikun Zhang, Chong Chen, Xian-Sheng Hua, Xiao Luo

This paper investigates a realistic but understudied problem of image retrieval under label noise, which could lead to severe overfitting or memorization of noisy samples during optimization.

Image Retrieval Memorization +1

BUS: Efficient and Effective Vision-Language Pre-Training with Bottom-Up Patch Summarization

no code implementations ICCV 2023 Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Songfang Huang

In this paper, we propose a Bottom-Up Patch Summarization approach named BUS which is inspired by the Document Summarization Task in NLP to learn a concise visual summary of lengthy visual token sequences, guided by textual semantics.

Abstractive Text Summarization Decoder +1

Reviewing Labels: Label Graph Network with Top-k Prediction Set for Relation Extraction

no code implementations 29 Dec 2022 Bo Li, Wei Ye, Jinglei Zhang, Shikun Zhang

Specifically, for a given sample, we build a label graph to review candidate labels in the Top-k prediction set and learn the connections between them.

Relation Relation Extraction

Sequence Generation with Label Augmentation for Relation Extraction

1 code implementation 29 Dec 2022 Bo Li, Dingyao Yu, Wei Ye, Jinglei Zhang, Shikun Zhang

Sequence generation demonstrates promising performance in recent information extraction efforts, by incorporating large-scale pre-trained Seq2Seq models.

Relation Relation Extraction

Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation

2 code implementations 19 Oct 2022 Botao Yu, Peiling Lu, Rui Wang, Wei Hu, Xu Tan, Wei Ye, Shikun Zhang, Tao Qin, Tie-Yan Liu

A recent trend is to use Transformer or its variants in music generation, which is, however, suboptimal, because the full attention cannot efficiently model the typically long music sequences (e.g., over 10,000 tokens), and the existing models have shortcomings in generating musical repetition structures.

Music Generation

Exploiting Hybrid Semantics of Relation Paths for Multi-hop Question Answering Over Knowledge Graphs

no code implementations COLING 2022 Zile Qiao, Wei Ye, Tong Zhang, Tong Mo, Weiping Li, Shikun Zhang

Answering natural language questions on knowledge graphs (KGQA) remains a great challenge in terms of understanding complex questions via multi-hop reasoning.

Answer Selection Knowledge Graphs +3

Frequency-Aware Contrastive Learning for Neural Machine Translation

no code implementations 29 Dec 2021 Tong Zhang, Wei Ye, Baosong Yang, Long Zhang, Xingzhang Ren, Dayiheng Liu, Jinan Sun, Shikun Zhang, Haibo Zhang, Wen Zhao

Inspired by the observation that low-frequency words form a more compact embedding space, we tackle this challenge from a representation learning perspective.

Contrastive Learning Machine Translation +3

Cross-document Event Identity via Dense Annotation

1 code implementation CoNLL (EMNLP) 2021 Adithya Pratapa, Zhengzhong Liu, Kimihiro Hasegawa, Linwei Li, Yukari Yamakawa, Shikun Zhang, Teruko Mitamura

To this end, we design a new annotation workflow with careful quality control and an easy-to-use annotation interface.

Unsupervised Out-of-Domain Detection via Pre-trained Transformers

1 code implementation ACL 2021 Keyang Xu, Tongzheng Ren, Shikun Zhang, Yihao Feng, Caiming Xiong

Deployed real-world machine learning applications are often subject to uncontrolled and even potentially malicious inputs.

QuadrupletBERT: An Efficient Model For Embedding-Based Large-Scale Retrieval

no code implementations NAACL 2021 Peiyang Liu, Sen Wang, Xi Wang, Wei Ye, Shikun Zhang

The embedding-based large-scale query-document retrieval problem is a hot topic in the information retrieval (IR) field.

Information Retrieval Retrieval

Multi-Hop Transformer for Document-Level Machine Translation

no code implementations NAACL 2021 Long Zhang, Tong Zhang, Haibo Zhang, Baosong Yang, Wei Ye, Shikun Zhang

Document-level neural machine translation (NMT) has proven to be of profound value for its effectiveness in capturing contextual information.

Document Level Machine Translation Document Translation +4

A Data-Centric Framework for Composable NLP Workflows

1 code implementation EMNLP 2020 Zhengzhong Liu, Guanxiong Ding, Avinash Bukkittu, Mansi Gupta, Pengzhi Gao, Atif Ahmed, Shikun Zhang, Xin Gao, Swapnil Singhavi, Linwei Li, Wei Wei, Zecong Hu, Haoran Shi, Haoying Zhang, Xiaodan Liang, Teruko Mitamura, Eric P. Xing, Zhiting Hu

Empirical natural language processing (NLP) systems in application domains (e.g., healthcare, finance, education) involve interoperation among multiple components, ranging from data ingestion, human annotation, to text retrieval, analysis, generation, and visualization.

Retrieval Text Retrieval

Expectation Synchronization Synthesis in Non-Markovian Open Quantum Systems

no code implementations 4 Jan 2021 Shikun Zhang, Kun Liu, Daoyi Dong, Xiaoxue Feng, Feng Pan

In this article, we investigate the problem of engineering synchronization in non-Markovian quantum systems.

Quantum Physics

FSV: Learning to Factorize Soft Value Function for Cooperative Multi-Agent Reinforcement Learning

no code implementations 1 Jan 2021 Yueheng Li, Tianhao Zhang, Chen Wang, Jinan Sun, Shikun Zhang, Guangming Xie

We explore energy-based solutions for cooperative multi-agent reinforcement learning (MARL) using the idea of function factorization in centralized training with decentralized execution (CTDE).

Multi-agent Reinforcement Learning reinforcement-learning +3

SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint

1 code implementation 9 Dec 2020 Zhonghao Sheng, Kaitao Song, Xu Tan, Yi Ren, Wei Ye, Shikun Zhang, Tao Qin

Automatic song writing aims to compose a song (lyric and/or melody) by machine, which is an interesting topic in both academia and industry.


Graph Enhanced Dual Attention Network for Document-Level Relation Extraction

no code implementations COLING 2020 Bo Li, Wei Ye, Zhonghao Sheng, Rui Xie, Xiangyu Xi, Shikun Zhang

Document-level relation extraction requires inter-sentence reasoning capabilities to capture local and global contextual information for multiple relational facts.

Document-level Relation Extraction Relation +1

Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning

no code implementations 24 Feb 2020 Wei Ye, Rui Xie, Jinglei Zhang, Tianxiang Hu, Xiaoyin Wang, Shikun Zhang

Since both tasks aim to model the association between natural language and programming language, recent studies have combined these two tasks to improve their performance.

Code Generation Code Summarization +3

Deep Dynamic Boosted Forest

no code implementations 19 Apr 2018 Haixin Wang, Xingzhang Ren, Jinan Sun, Wei Ye, Long Chen, Muzhi Yu, Shikun Zhang

Specifically, we propose to measure the quality of each leaf node of every decision tree in the random forest to determine hard examples.

Ensemble Learning
