Search Results for author: Wei Ye

Found 102 papers, 56 papers with code

Label Smoothing for Text Mining

no code implementations COLING 2022 Peiyang Liu, Xiangyu Xi, Wei Ye, Shikun Zhang

This paper presents a novel keyword-based LS method to automatically generate soft labels from hard labels via exploiting the relevance between labels and text instances.

text-classification Text Classification +1
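
The keyword-based relevance weighting is specific to this paper and is not reproduced here; as background only, below is a minimal Python/NumPy sketch of how a hard label becomes a soft label under plain label smoothing, with a hypothetical `relevance` vector standing in for whatever label-text relevance scores such a method might supply.

import numpy as np

def smooth_labels(hard_label, num_classes, epsilon=0.1, relevance=None):
    # Uniform label smoothing spreads epsilon mass evenly over all classes;
    # if a (hypothetical) per-class relevance vector is supplied, the epsilon
    # mass is distributed proportionally to it instead.
    if relevance is None:
        weights = np.full(num_classes, 1.0 / num_classes)
    else:
        weights = np.asarray(relevance, dtype=float)
        weights = weights / weights.sum()
    one_hot = np.eye(num_classes)[hard_label]
    return (1.0 - epsilon) * one_hot + epsilon * weights

print(smooth_labels(2, 4))                                  # uniform smoothing
print(smooth_labels(2, 4, relevance=[0.1, 0.2, 0.5, 0.2]))  # relevance-weighted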

Improving Embedding-based Large-scale Retrieval via Label Enhancement

no code implementations Findings (EMNLP) 2021 Peiyang Liu, Xi Wang, Sen Wang, Wei Ye, Xiangyu Xi, Shikun Zhang

Current embedding-based large-scale retrieval models are trained with 0-1 hard labels that indicate whether a query is relevant to a document, ignoring rich information about the relevance degree.

Retrieval

RewardAnything: Generalizable Principle-Following Reward Models

1 code implementation 4 Jun 2025 Zhuohao Yu, Jiali Zeng, Weizheng Gu, Yidong Wang, Jindong Wang, Fandong Meng, Jie Zhou, Yue Zhang, Shikun Zhang, Wei Ye

To measure this capability, we develop RABench, a comprehensive benchmark for RMs focusing on generalization across diverse principles.

Instruction Following Large Language Model +1

Jigsaw-Puzzles: From Seeing to Understanding to Reasoning in Vision-Language Models

no code implementations 27 May 2025 Zesen Lyu, Dandan Zhang, Wei Ye, Fangdi Li, Zhihang Jiang, Yao Yang

Spatial reasoning is a core component of human cognition, enabling individuals to perceive, comprehend, and interact with the physical world.

Diagnostic Spatial Reasoning

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

no code implementations 23 May 2025 Deyang Kong, Qi Guo, Xiangyu Xi, Wei Wang, Jingang Wang, Xunliang Cai, Shikun Zhang, Wei Ye

Reinforcement learning exhibits potential in enhancing the reasoning abilities of large language models, yet it is hard to scale due to the low sample efficiency of the rollout phase.

Scheduling

MPL: Multiple Programming Languages with Large Language Models for Information Extraction

1 code implementation 22 May 2025 Bo Li, Gexiang Fang, Wei Ye, Zhenghua Xu, Jinglei Zhang, Hao Cheng, Shikun Zhang

In this research, we propose Multiple Programming Languages with large language models for information extraction (abbreviated as MPL), a novel framework that explores the potential of incorporating different PLs in the SFT phase.

Structured Output Generation

Mitigating Spurious Correlations with Causal Logit Perturbation

no code implementations 21 May 2025 Xiaoling Zhou, Wei Ye, Rui Xie, Shikun Zhang

This study attempts to implement causal models via logit perturbations and introduces a novel Causal Logit Perturbation (CLP) framework to train classifiers with generated causal logit perturbations for individual samples, thereby mitigating the spurious associations between non-causal attributes (i.e., image backgrounds) and classes.

counterfactual Long-tail Learning +1
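
The snippet above only names the mechanism; how CLP actually generates its causal perturbations is not described there and is not reproduced here. The sketch below shows just the generic mechanics of training a classifier on per-sample perturbed logits in PyTorch, with a toy linear module standing in as a placeholder for the perturbation generator.

import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes = 3
model = nn.Linear(8, num_classes)       # toy classifier
generator = nn.Linear(8, num_classes)   # placeholder for a per-sample perturbation module

x = torch.randn(16, 8)
y = torch.randint(0, num_classes, (16,))

logits = model(x)
delta = generator(x)                       # per-sample logit perturbation
loss = F.cross_entropy(logits + delta, y)  # classifier is trained on perturbed logits
loss.backward()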

Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective

1 code implementation 15 May 2025 Yutao Mou, Xiao Deng, Yuxiao Luo, Shikun Zhang, Wei Ye

In this paper, we first propose CoV-Eval, a multi-task benchmark covering various tasks such as code completion, vulnerability repair, vulnerability detection and classification, for comprehensive evaluation of LLM code security.

Code Completion Code Generation +1

RI3D: Few-Shot Gaussian Splatting With Repair and Inpainting Diffusion Priors

1 code implementation 13 Mar 2025 Avinash Paliwal, Xilong Zhou, Wei Ye, Jinhui Xiong, Rakesh Ranjan, Nima Khademi Kalantari

We demonstrate that by separating the process into two tasks and addressing them with the repair and inpainting models, we produce results with detailed textures in both visible and missing regions that outperform state-of-the-art approaches on a diverse set of scenes with extremely sparse inputs.

3DGS

SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity

no code implementations 3 Mar 2025 Xiangyu Xi, Deyang Kong, Jian Yang, Jiawei Yang, Zhengyu Chen, Wei Wang, Jingang Wang, Xunliang Cai, Shikun Zhang, Wei Ye

Existing pretraining data mixing methods for large language models (LLMs) typically follow a domain-wise methodology, a top-down process that first determines domain weights and then performs uniform data sampling across each domain.

Diversity
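
For contrast with the sample-wise strategy this paper proposes, the sketch below implements only the domain-wise (top-down) recipe described in the sentence above: pick a domain according to fixed weights, then sample uniformly within it. The corpus and weights are made up for illustration.

import random

corpus = {"web": ["w1", "w2", "w3"], "code": ["c1", "c2"], "books": ["b1", "b2", "b3", "b4"]}
domain_weights = {"web": 0.5, "code": 0.2, "books": 0.3}   # hypothetical top-down weights

def sample_domain_wise(n):
    # First choose a domain by its weight, then sample uniformly inside that domain.
    domains, weights = zip(*domain_weights.items())
    picks = random.choices(domains, weights=weights, k=n)
    return [random.choice(corpus[d]) for d in picks]

print(sample_domain_wise(5))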

All-Optical Nonlinear Diffractive Deep Network for Ultrafast Image Denoising

no code implementations CVPR 2025 Xiaoling Zhou, Zhemg Lee, Wei Ye, Rui Xie, Wenbo Zhang, Guanju Peng, Zongze Li, Shikun Zhang

A new benchmark dataset, termed MIDD, is constructed for mode image denoising, comprising 120K pairs of noisy/noise-free images captured from real fiber communication systems across various transmission lengths.

All Image Denoising

Reasoning Through Execution: Unifying Process and Outcome Rewards for Code Generation

1 code implementation 19 Dec 2024 Zhuohao Yu, Weizheng Gu, Yidong Wang, Xingru Jiang, Zhengran Zeng, Jindong Wang, Wei Ye, Shikun Zhang

Large Language Models excel at code generation yet struggle with complex programming tasks that demand sophisticated reasoning.

Code Generation

Deep Spectral Clustering via Joint Spectral Embedding and Kmeans

1 code implementation 15 Dec 2024 Wengang Guo, Wei Ye

It first maps data into the spectral embedding space and then uses Kmeans to find clusters.

Clustering
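
The sentence above describes the classic decoupled pipeline that this paper replaces with joint optimization. As a reference point only, here is a minimal sketch of that decoupled baseline (normalized-Laplacian eigenvectors followed by k-means) using NumPy and scikit-learn.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_kernels

def spectral_then_kmeans(X, n_clusters, gamma=1.0):
    # Stage 1: spectral embedding from the normalized graph Laplacian.
    W = pairwise_kernels(X, metric="rbf", gamma=gamma)       # affinity matrix
    d = W.sum(axis=1)
    L_sym = np.eye(len(X)) - W / np.sqrt(np.outer(d, d))     # I - D^-1/2 W D^-1/2
    _, eigvecs = np.linalg.eigh(L_sym)
    emb = eigvecs[:, :n_clusters]                            # smallest eigenvectors
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    # Stage 2: k-means on the embedding, fully decoupled from stage 1.
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(emb)

X = np.random.rand(60, 2)
print(spectral_then_kmeans(X, n_clusters=3))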

Robust Multiple Description Neural Video Codec with Masked Transformer for Dynamic and Noisy Networks

no code implementations 10 Dec 2024 Xinyue Hu, Wei Ye, Jiaxiang Tang, Eman Ramadan, Zhi-Li Zhang

We propose a novel MDC video codec, NeuralMDC, demonstrating how bidirectional transformers trained for masked token prediction can vastly simplify the design of MDC video codec.

motion prediction

SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

1 code implementation CVPR 2025 Hongrui Jia, Chaoya Jiang, Haiyang Xu, Wei Ye, Mengfan Dong, Ming Yan, Ji Zhang, Fei Huang, Shikun Zhang

This forces the model to carefully understand the demonstration images and establish a relationship between the images and the symbols to answer questions correctly.

In-Context Learning

SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types

1 code implementation 29 Oct 2024 Yutao Mou, Shikun Zhang, Wei Ye

To overcome these issues, we developed SG-Bench, a novel benchmark to assess the generalization of LLM safety across various tasks and prompt types.

Language Modeling Language Modelling +3

Compressed Depth Map Super-Resolution and Restoration: AIM 2024 Challenge Results

no code implementations 24 Sep 2024 Marcos V. Conde, Florin-Alexandru Vasluianu, Jinhui Xiong, Wei Ye, Rakesh Ranjan, Radu Timofte

The increasing demand for augmented reality (AR) and virtual reality (VR) applications highlights the need for efficient depth information processing.

Depth Map Super-Resolution

MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model

no code implementations 22 Aug 2024 Chaoya Jiang, Jia Hongrui, Haiyang Xu, Wei Ye, Mengfan Dong, Ming Yan, Ji Zhang, Fei Huang, Shikun Zhang

This paper presents MaVEn, an innovative Multi-granularity Visual Encoding framework designed to enhance the capabilities of Multimodal Large Language Models (MLLMs) in multi-image reasoning.

Language Modeling Language Modelling +3

Refining Corpora from a Model Calibration Perspective for Chinese Spelling Correction

no code implementations 22 Jul 2024 Dingyao Yu, Yang An, Wei Ye, Xiongfeng Xiao, Shaoguang Mao, Tao Ge, Shikun Zhang

Specifically, OCR/ASR-based data samples are fed into a well-calibrated CSC model trained on random replacement-based corpora and then filtered based on prediction confidence.

Data Augmentation Optical Character Recognition (OCR) +1
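
As one concrete reading of the filtering step described above (the calibrated CSC model and corpora themselves are not reproduced), the sketch below keeps only the samples on which a placeholder scoring function is sufficiently confident.

def filter_by_confidence(samples, top_prob, threshold=0.9):
    # `top_prob` stands in for a calibrated model: it returns the probability
    # the model assigns to its top prediction for one sample.
    kept, dropped = [], []
    for s in samples:
        (kept if top_prob(s) >= threshold else dropped).append(s)
    return kept, dropped

# Toy usage with fake confidences standing in for the calibrated CSC model.
scores = {"sample_a": 0.97, "sample_b": 0.55, "sample_c": 0.93}
kept, dropped = filter_by_confidence(list(scores), scores.get, threshold=0.9)
print(kept, dropped)   # ['sample_a', 'sample_c'] ['sample_b']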

Enhancing In-Context Learning via Implicit Demonstration Augmentation

no code implementations 27 Jun 2024 Xiaoling Zhou, Wei Ye, Yidong Wang, Chaoya Jiang, Zhemg Lee, Rui Xie, Shikun Zhang

The emergence of in-context learning (ICL) enables large pre-trained language models (PLMs) to make predictions for unseen inputs without updating parameters.

In-Context Learning

Decoupling Forgery Semantics for Generalizable Deepfake Detection

1 code implementation 14 Jun 2024 Wei Ye, Xinan He, Feng Ding

The unique forgery semantics and irrelevant content semantics may promote over-fitting and hamper generalization for DeepFake detectors.

DeepFake Detection Face Swapping

Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling

no code implementations 12 Jun 2024 Zile Qiao, Wei Ye, Yong Jiang, Tong Mo, Pengjun Xie, Weiping Li, Fei Huang, Shikun Zhang

Retrieval-augmented language models (RALMs) have recently shown great potential in mitigating the limitations of implicit knowledge in LLMs, such as untimely updating of the latest expertise and unreliable retention of long-tail knowledge.

Language Modeling Language Modelling +1

AutoSurvey: Large Language Models Can Automatically Write Surveys

1 code implementation 10 Jun 2024 Yidong Wang, Qi Guo, Wenjin Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang

This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys in rapidly evolving fields like artificial intelligence.

Retrieval Survey

Concept Matching with Agent for Out-of-Distribution Detection

1 code implementation 27 May 2024 YuXiao Lee, Xiaofeng Cao, Jingcai Guo, Wei Ye, Qing Guo, Yi Chang

The remarkable achievements of Large Language Models (LLMs) have captivated the attention of both academia and industry, transcending their initial role in dialogue generation.

Dialogue Generation Out-of-Distribution Detection +1

A3: Ambiguous Aberrations Captured via Astray-Learning for Facial Forgery Semantic Sublimation

no code implementations 24 May 2024 Xinan He, Yue Zhou, Wei Ye, Feng Ding

The primary objective of the proposed method is to blend hybrid forgery semantics derived from high-frequency components into authentic imagery, named aberrations.

DeepFake Detection Face Swapping +1

Deep Hierarchical Graph Alignment Kernels

1 code implementation 9 May 2024 Shuhao Tang, Hao Tian, Xiaofeng Cao, Wei Ye

Typical R-convolution graph kernels invoke the kernel functions that decompose graphs into non-isomorphic substructures and compare them.

Position
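
As background for the R-convolution idea referenced above (decompose graphs into substructures and compare them), the sketch below computes a simple shortest-path-length histogram kernel with NetworkX. It illustrates the kernel family under discussion, not the hierarchical alignment kernel proposed in this paper.

import networkx as nx
from collections import Counter

def sp_histogram(G):
    # The "substructures" here are shortest paths, summarized by their lengths.
    counts = Counter()
    for _, lengths in nx.all_pairs_shortest_path_length(G):
        counts.update(l for l in lengths.values() if l > 0)
    return counts

def sp_kernel(G1, G2):
    # R-convolution style comparison: dot product of the two substructure histograms.
    h1, h2 = sp_histogram(G1), sp_histogram(G2)
    return sum(h1[k] * h2[k] for k in h1.keys() & h2.keys())

print(sp_kernel(nx.cycle_graph(5), nx.path_graph(5)))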

Generative manufacturing systems using diffusion models and ChatGPT

no code implementations 2 May 2024 Xingyu Li, Fei Tao, Wei Ye, Aydin Nassehi, John W. Sutherland

In this study, we introduce Generative Manufacturing Systems (GMS) as a novel approach to effectively manage and coordinate autonomous manufacturing assets, thereby enhancing their responsiveness and flexibility to address a wide array of production objectives and human preferences.

Decision Making Diversity

Boosting Model Resilience via Implicit Adversarial Data Augmentation

no code implementations 25 Apr 2024 Xiaoling Zhou, Wei Ye, Zhemg Lee, Rui Xie, Shikun Zhang

This insight leads us to develop a meta-learning-based framework for optimizing classifiers with this novel loss, introducing the effects of augmentation while bypassing the explicit augmentation process.

Data Augmentation Long-tail Learning +2

FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models

1 code implementation 9 Apr 2024 Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Zhengran Zeng, Wei Ye, Jindong Wang, Yue Zhang, Shikun Zhang

The rapid development of large language model (LLM) evaluation methodologies and datasets has led to a profound challenge: integrating state-of-the-art evaluation techniques cost-effectively while ensuring reliability, reproducibility, and efficiency.

Fairness GPU +2

CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians

1 code implementation 28 Mar 2024 Avinash Paliwal, Wei Ye, Jinhui Xiong, Dmytro Kotovenko, Rakesh Ranjan, Vikas Chandra, Nima Khademi Kalantari

The field of 3D reconstruction from images has rapidly evolved in the past few years, first with the introduction of Neural Radiance Field (NeRF) and more recently with 3D Gaussian Splatting (3DGS).

3DGS 3D Reconstruction +3

CodeShell Technical Report

no code implementations 23 Mar 2024 Rui Xie, Zhengran Zeng, Zhuohao Yu, Chang Gao, Shikun Zhang, Wei Ye

Through this process, we have curated 100 billion high-quality pre-training data from GitHub.

8k HumanEval

NightHaze: Nighttime Image Dehazing via Self-Prior Learning

no code implementations 12 Mar 2024 Beibei Lin, Yeying Jin, Wending Yan, Wei Ye, Yuan Yuan, Robby T. Tan

By increasing the noise values until they approach the pixel intensity values of the glow and light-effect blended images, our augmentation becomes severe, resulting in stronger priors.

Image Dehazing Image Enhancement

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

1 code implementation 5 Mar 2024 Zeqian Ju, Yuancheng Wang, Kai Shen, Xu Tan, Detai Xin, Dongchao Yang, Yanqing Liu, Yichong Leng, Kaitao Song, Siliang Tang, Zhizheng Wu, Tao Qin, Xiang-Yang Li, Wei Ye, Shikun Zhang, Jiang Bian, Lei He, Jinyu Li, Sheng Zhao

Specifically, 1) we design a neural codec with factorized vector quantization (FVQ) to disentangle speech waveform into subspaces of content, prosody, timbre, and acoustic details; 2) we propose a factorized diffusion model to generate attributes in each subspace following its corresponding prompt.

Quantization Speech Synthesis +2
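
The factorized codec is only summarized above; as a conceptual sketch (not the paper's actual codec architecture), the code below quantizes separate chunks of a feature vector against separate codebooks, which is the basic idea behind factorized vector quantization. Dimensions and codebooks are made up.

import numpy as np

rng = np.random.default_rng(0)
subspace_dim, codebook_size = 8, 16
# One (random) codebook per attribute subspace, mirroring content/prosody/timbre/details.
codebooks = {name: rng.normal(size=(codebook_size, subspace_dim))
             for name in ["content", "prosody", "timbre", "details"]}

def factorized_quantize(z):
    # Split z into subspaces and snap each chunk to its nearest codebook entry.
    out = {}
    for i, (name, book) in enumerate(codebooks.items()):
        chunk = z[i * subspace_dim:(i + 1) * subspace_dim]
        out[name] = int(np.argmin(np.linalg.norm(book - chunk, axis=1)))
    return out

z = rng.normal(size=subspace_dim * len(codebooks))
print(factorized_quantize(z))   # e.g. {'content': 3, 'prosody': 11, ...}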

Generative Retrieval with Large Language Models

no code implementations 26 Feb 2024 Ye Wang, Xinrun Xu, Rui Xie, Wenxin Hu, Wei Ye

When completing knowledge-intensive tasks, humans sometimes need not just an answer but also a corresponding reference passage for auxiliary reading.

Position Retrieval

Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models

no code implementations 24 Feb 2024 Chaoya Jiang, Hongrui Jia, Wei Ye, Mengfan Dong, Haiyang Xu, Ming Yan, Ji Zhang, Shikun Zhang

Large Vision Language Models exhibit remarkable capabilities but struggle with hallucinations: inconsistencies between images and their descriptions.

Hallucination Hallucination Evaluation

KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

2 code implementations 23 Feb 2024 Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Wei Ye, Jindong Wang, Xing Xie, Yue Zhang, Shikun Zhang

Automatic evaluation methods for large language models (LLMs) are hindered by data contamination, leading to inflated assessments of their effectiveness.

Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection

no code implementations 11 Jan 2024 Wei Ye, Chaoya Jiang, Haiyang Xu, Chenhao Ye, Chenliang Li, Ming Yan, Shikun Zhang, Songhang Huang, Fei Huang

Vision Transformers (ViTs) have become increasingly popular in large-scale Vision and Language Pre-training (VLP) models.

NightRain: Nighttime Video Deraining via Adaptive-Rain-Removal and Adaptive-Correction

no code implementations 1 Jan 2024 Beibei Lin, Yeying Jin, Wending Yan, Wei Ye, Yuan Yuan, Shunli Zhang, Robby Tan

However, the intricacies of the real world, particularly with the presence of light effects and low-light regions affected by noise, create significant domain gaps, hampering synthetic-trained models in removing rain streaks properly and leading to over-saturation and color shifts.

Rain Removal Video deraining

HINTED: Hard Instance Enhanced Detector with Mixed-Density Feature Fusion for Sparsely-Supervised 3D Object Detection

1 code implementation CVPR 2024 Qiming Xia, Wei Ye, Hai Wu, Shijia Zhao, Leyuan Xing, Xun Huang, Jinhao Deng, Xin Li, Chenglu Wen, Cheng Wang

Compared with leading sparsely-supervised methods, HINTED significantly improves the detection performance on hard instances, notably outperforming fully-supervised methods in detecting challenging categories like cyclists.

3D Object Detection object-detection

PICNN: A Pathway towards Interpretable Convolutional Neural Networks

1 code implementation 19 Dec 2023 Wengang Guo, Jiayi Yang, Huilin Yin, Qijun Chen, Wei Ye

Experimental results have demonstrated that our method PICNN (the combination of standard CNNs with our proposed pathway) exhibits greater interpretability than standard CNNs while achieving higher or comparable discrimination power.

Labels Need Prompts Too: Mask Matching for Natural Language Understanding Tasks

no code implementations 14 Dec 2023 Bo Li, Wei Ye, Quansen Wang, Wen Zhao, Shikun Zhang

Textual label names (descriptions) are typically semantically rich in many natural language understanding (NLU) tasks.

Natural Language Understanding

COMBHelper: A Neural Approach to Reduce Search Space for Graph Combinatorial Problems

1 code implementation 14 Dec 2023 Hao Tian, Sourav Medya, Wei Ye

Combinatorial Optimization (CO) problems over graphs appear routinely in many applications such as in optimizing traffic, viral marketing in social networks, and matching for job allocation.

Combinatorial Optimization Graph Neural Network +2

TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training

1 code implementation 14 Dec 2023 Chaoya Jiang, Wei Ye, Haiyang Xu, Qinghao Ye, Ming Yan, Ji Zhang, Shikun Zhang

Self-supervised Multi-modal Contrastive Learning (SMCL) remarkably advances modern Vision-Language Pre-training (VLP) models by aligning visual and linguistic modalities.

Contrastive Learning Data Augmentation

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

1 code implementation CVPR 2024 Chaoya Jiang, Haiyang Xu, Mengfan Dong, Jiaxing Chen, Wei Ye, Ming Yan, Qinghao Ye, Ji Zhang, Fei Huang, Shikun Zhang

We first analyzed the representation distribution of textual and visual tokens in MLLM, revealing two important findings: 1) there is a significant gap between textual and visual representations, indicating unsatisfactory cross-modal representation alignment; 2) representations of texts that contain and do not contain hallucinations are entangled, making it challenging to distinguish them.

Contrastive Learning Hallucination +6

MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

1 code implementation 18 Oct 2023 Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian

For developers and amateurs, it is very difficult to grasp all of these tasks to satisfy their requirements in music processing, especially considering the huge differences in the representations of music data and the model applicability across platforms among various tasks.

AI Agent Music Classification

Enhancing Visibility in Nighttime Haze Images Using Guided APSF and Gradient Adaptive Convolution

1 code implementation 3 Aug 2023 Yeying Jin, Beibei Lin, Wending Yan, Yuan Yuan, Wei Ye, Robby T. Tan

In this paper, we enhance the visibility from a single nighttime haze image by suppressing glow and enhancing low-light regions.

Image Dehazing

BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization

no code implementations 17 Jul 2023 Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Songfang Huang

Specifically, we incorporate a Text-Semantics-Aware Patch Selector (TSPS) into the ViT backbone to perform a coarse-grained visual token extraction and then attach a flexible Transformer-based Patch Abstraction Decoder (PAD) upon the backbone for top-level visual abstraction.

Decoder Text Summarization

A Survey on Evaluation of Large Language Models

1 code implementation 6 Jul 2023 Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Linyi Yang, Kaijie Zhu, Hao Chen, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, Xing Xie

Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications.

Ethics Survey

EmoGen: Eliminating Subjective Bias in Emotional Music Generation

1 code implementation 3 Jul 2023 Chenfei Kang, Peiling Lu, Botao Yu, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian

In this paper, we propose EmoGen, an emotional music generation system that leverages a set of emotion-related music attributes as the bridge between emotion and music, and divides the generation into two stages: emotion-to-attribute mapping with supervised clustering, and attribute-to-music generation with self-supervised learning.

Attribute Clustering +2

Exploiting Pseudo Future Contexts for Emotion Recognition in Conversations

1 code implementation 27 Jun 2023 Yinyi Wei, Shuaipeng Liu, Hailei Yan, Wei Ye, Tong Mo, Guanglu Wan

Specifically, for an utterance, we generate its future context with pre-trained language models, potentially containing extra beneficial knowledge in a conversational form homogeneous with the historical ones.

Emotion Recognition

PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization

2 code implementations 8 Jun 2023 Yidong Wang, Zhuohao Yu, Zhengran Zeng, Linyi Yang, Cunxiang Wang, Hao Chen, Chaoya Jiang, Rui Xie, Jindong Wang, Xing Xie, Wei Ye, Shikun Zhang, Yue Zhang

To ensure the reliability of PandaLM, we collect a diverse human-annotated test dataset, where all contexts are generated by humans and labels are aligned with human preferences.

Language Modelling Large Language Model

GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework

1 code implementation 18 May 2023 Ang Lv, Xu Tan, Peiling Lu, Wei Ye, Shikun Zhang, Jiang Bian, Rui Yan

Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with arbitrary source-target track combinations.

Denoising Music Generation

Exploiting Pseudo Image Captions for Multimodal Summarization

no code implementations 9 May 2023 Chaoya Jiang, Rui Xie, Wei Ye, Jinan Sun, Shikun Zhang

Cross-modal contrastive learning in vision language pretraining (VLP) faces the challenge of (partial) false negatives.

Common Sense Reasoning Contrastive Learning +1

Evaluating ChatGPT's Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness

1 code implementation 23 Apr 2023 Bo Li, Gexiang Fang, Yang Yang, Quansen Wang, Wei Ye, Wen Zhao, Shikun Zhang

The capability of Large Language Models (LLMs) like ChatGPT to comprehend user intent and provide reasonable responses has made them extremely popular lately.

Exploring Vision-Language Models for Imbalanced Learning

1 code implementation 4 Apr 2023 Yidong Wang, Zhuohao Yu, Jindong Wang, Qiang Heng, Hao Chen, Wei Ye, Rui Xie, Xing Xie, Shikun Zhang

However, their performance on imbalanced datasets is relatively poor, where the distribution of classes in the training dataset is skewed, leading to poor performance in predicting minority classes.

Decoder zero-shot-classification +1

On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective

1 code implementation 22 Feb 2023 Jindong Wang, Xixu Hu, Wenxin Hou, Hao Chen, Runkai Zheng, Yidong Wang, Linyi Yang, Haojun Huang, Wei Ye, Xiubo Geng, Binxin Jiao, Yue Zhang, Xing Xie

In this paper, we conduct a thorough evaluation of the robustness of ChatGPT from the adversarial and out-of-distribution (OOD) perspective.

Adversarial Robustness Chatbot +1

BUS: Efficient and Effective Vision-Language Pre-Training with Bottom-Up Patch Summarization

no code implementations ICCV 2023 Chaoya Jiang, Haiyang Xu, Wei Ye, Qinghao Ye, Chenliang Li, Ming Yan, Bin Bi, Shikun Zhang, Fei Huang, Songfang Huang

In this paper, we propose a Bottom-Up Patch Summarization approach named BUS which is inspired by the Document Summarization Task in NLP to learn a concise visual summary of lengthy visual token sequences, guided by textual semantics.

Abstractive Text Summarization Decoder +1

Sequence Generation with Label Augmentation for Relation Extraction

1 code implementation 29 Dec 2022 Bo Li, Dingyao Yu, Wei Ye, Jinglei Zhang, Shikun Zhang

Sequence generation demonstrates promising performance in recent information extraction efforts, by incorporating large-scale pre-trained Seq2Seq models.

Relation Relation Extraction

Reviewing Labels: Label Graph Network with Top-k Prediction Set for Relation Extraction

no code implementations 29 Dec 2022 Bo Li, Wei Ye, Jinglei Zhang, Shikun Zhang

Specifically, for a given sample, we build a label graph to review candidate labels in the Top-k prediction set and learn the connections between them.

Prediction Relation +1

MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts

1 code implementation 25 Nov 2022 Xiangyu Xi, Jianwei Lv, Shuaipeng Liu, Wei Ye, Fan Yang, Guanglu Wan

As a pioneering exploration that expands event detection to the scenarios involving informal and heterogeneous texts, we propose a new large-scale Chinese event detection dataset based on user reviews, text conversations, and phone conversations in a leading e-commerce platform for food service.

Articles Event Detection

Consistent Direct Time-of-Flight Video Depth Super-Resolution

1 code implementation CVPR 2023 Zhanghao Sun, Wei Ye, Jinhui Xiong, Gyeongmin Choe, Jialiang Wang, Shuochen Su, Rakesh Ranjan

We believe the methods and dataset are beneficial to a broad community as dToF depth sensing is becoming mainstream on mobile devices.

Super-Resolution

DeS3: Adaptive Attention-driven Self and Soft Shadow Removal using ViT Similarity

1 code implementation 15 Nov 2022 Yeying Jin, Wei Ye, Wenhan Yang, Yuan Yuan, Robby T. Tan

Most existing methods rely on binary shadow masks, without considering the ambiguous boundaries of soft and self shadows.

Image Shadow Removal Shadow Removal

Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation

1 code implementation 19 Oct 2022 Botao Yu, Peiling Lu, Rui Wang, Wei Hu, Xu Tan, Wei Ye, Shikun Zhang, Tao Qin, Tie-Yan Liu

A recent trend is to use Transformer or its variants in music generation, which is, however, suboptimal, because the full attention cannot efficiently model the typically long music sequences (e.g., over 10,000 tokens), and the existing models have shortcomings in generating musical repetition structures.

Music Generation

Exploiting Hybrid Semantics of Relation Paths for Multi-hop Question Answering Over Knowledge Graphs

no code implementations COLING 2022 Zile Qiao, Wei Ye, Tong Zhang, Tong Mo, Weiping Li, Shikun Zhang

Answering natural language questions on knowledge graphs (KGQA) remains a great challenge in terms of understanding complex questions via multi-hop reasoning.

Answer Selection Knowledge Graphs +3

Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets

1 code implementation 15 Aug 2022 Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Xiang Li, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides

Beyond classification, Conv-Adapter can generalize to detection and segmentation tasks with more than 50% reduction of parameters but comparable performance to the traditional full fine-tuning.

Transfer Learning

Multi-scale Wasserstein Shortest-path Graph Kernels for Graph Classification

1 code implementation 2 Jun 2022 Wei Ye, Hao Tian, Qijun Chen

However, the existing R-convolution graph kernels cannot resolve both of the two challenges: 1) Comparing graphs at multiple different scales, and 2) Considering the distributions of substructures when computing the kernel matrix.

Graph Classification

A Low-Cost, Controllable and Interpretable Task-Oriented Chatbot: With Real-World After-Sale Services as Example

no code implementations 13 May 2022 Xiangyu Xi, Chenxu Lv, Yuncheng Hua, Wei Ye, Chaobo Sun, Shuaipeng Liu, Fan Yang, Guanglu Wan

Though widely used in industry, traditional task-oriented dialogue systems suffer from three bottlenecks: (i) difficult ontology construction (e.g., intents and slots); (ii) poor controllability and interpretability; (iii) annotation-hungry.

Chatbot Task-Oriented Dialogue Systems

Incorporating Heterophily into Graph Neural Networks for Graph Classification

1 code implementation 15 Mar 2022 Jiayi Yang, Sourav Medya, Wei Ye

Graph Neural Networks (GNNs) often assume strong homophily for graph classification, seldom considering heterophily, which means connected nodes tend to have different class labels and dissimilar features.

Graph Classification
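
Homophily versus heterophily, as used above, is commonly quantified by the edge homophily ratio, i.e., the fraction of edges whose endpoints share a label; a small NetworkX sketch follows as background for the terminology (it is not the method proposed in the paper).

import networkx as nx

def edge_homophily(G, labels):
    # Fraction of edges that connect two nodes with the same class label.
    edges = list(G.edges())
    same = sum(labels[u] == labels[v] for u, v in edges)
    return same / len(edges)

G = nx.karate_club_graph()
labels = {n: d["club"] for n, d in G.nodes(data=True)}   # the two factions as labels
print(edge_homophily(G, labels))   # close to 1.0, i.e., a fairly homophilous graph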

Graph Neural Diffusion Networks for Semi-supervised Learning

1 code implementation 24 Jan 2022 Wei Ye, Zexi Huang, Yunqi Hong, Ambuj Singh

To solve these two issues, we propose a new graph neural network called GND-Nets (for Graph Neural Diffusion Networks) that exploits the local and global neighborhood information of a vertex in a single layer.

Graph Neural Network

Modeling Human-AI Team Decision Making

1 code implementation 8 Jan 2022 Wei Ye, Francesco Bullo, Noah Friedkin, Ambuj K Singh

AI and humans bring complementary skills to group deliberations.

Decision Making

Frequency-Aware Contrastive Learning for Neural Machine Translation

no code implementations 29 Dec 2021 Tong Zhang, Wei Ye, Baosong Yang, Long Zhang, Xingzhang Ren, Dayiheng Liu, Jinan Sun, Shikun Zhang, Haibo Zhang, Wen Zhao

Inspired by the observation that low-frequency words form a more compact embedding space, we tackle this challenge from a representation learning perspective.

Contrastive Learning Diversity +5

Temporally Consistent Online Depth Estimation in Dynamic Scenes

no code implementations 17 Nov 2021 Zhaoshuo Li, Wei Ye, Dilin Wang, Francis X. Creighton, Russell H. Taylor, Ganesh Venkatesh, Mathias Unberath

We present a framework named Consistent Online Dynamic Depth (CODD) to produce temporally consistent depth estimates in dynamic scenes in an online setting.

Stereo Depth Estimation

Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation

1 code implementation CVPR 2022 Jiaqi Gu, Hyoukjun Kwon, Dilin Wang, Wei Ye, Meng Li, Yu-Hsin Chen, Liangzhen Lai, Vikas Chandra, David Z. Pan

Therefore, we propose HRViT, which enhances ViTs to learn semantically-rich and spatially-precise multi-scale representations by integrating high-resolution multi-branch architectures with ViTs.

image-classification Image Classification +4

Deep Embedded K-Means Clustering

2 code implementations 30 Sep 2021 Wengang Guo, Kaiyan Lin, Wei Ye

To this end, we discard the decoder and propose a greedy method to optimize the representation.

Clustering Decoder +2

QuadrupletBERT: An Efficient Model For Embedding-Based Large-Scale Retrieval

no code implementations NAACL 2021 Peiyang Liu, Sen Wang, Xi Wang, Wei Ye, Shikun Zhang

The embedding-based large-scale query-document retrieval problem is a hot topic in the information retrieval (IR) field.

Information Retrieval Retrieval

Multi-Hop Transformer for Document-Level Machine Translation

no code implementations NAACL 2021 Long Zhang, Tong Zhang, Haibo Zhang, Baosong Yang, Wei Ye, Shikun Zhang

Document-level neural machine translation (NMT) has proven to be of profound value for its effectiveness on capturing contextual information.

Document Level Machine Translation Document Translation +4

SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint

1 code implementation 9 Dec 2020 Zhonghao Sheng, Kaitao Song, Xu Tan, Yi Ren, Wei Ye, Shikun Zhang, Tao Qin

Automatic song writing aims to compose a song (lyric and/or melody) by machine, which is an interesting topic in both academia and industry.

Sentence

Graph Enhanced Dual Attention Network for Document-Level Relation Extraction

no code implementations COLING 2020 Bo Li, Wei Ye, Zhonghao Sheng, Rui Xie, Xiangyu Xi, Shikun Zhang

Document-level relation extraction requires inter-sentence reasoning capabilities to capture local and global contextual information for multiple relational facts.

Document-level Relation Extraction Relation +1

Learning Deep Graph Representations via Convolutional Neural Networks

1 code implementation 5 Apr 2020 Wei Ye, Omid Askarisichani, Alex Jones, Ambuj Singh

The learned deep representation for a graph is a dense and low-dimensional vector that captures complex high-order interactions in a vertex neighborhood.

General Classification Graph Classification

Incorporating User's Preference into Attributed Graph Clustering

1 code implementation 24 Mar 2020 Wei Ye, Dominik Mautz, Christian Boehm, Ambuj Singh, Claudia Plant

In contrast to global clustering, local clustering aims to find only one cluster that is concentrating on the given seed vertex (and also on the designated attributes for attributed graphs).

Attribute Clustering +1

Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning

no code implementations 24 Feb 2020 Wei Ye, Rui Xie, Jinglei Zhang, Tianxiang Hu, Xiaoyin Wang, Shikun Zhang

Since both tasks aim to model the association between natural language and programming language, recent studies have combined these two tasks to improve their performance.

Code Generation Code Summarization +3

Tree++: Truncated Tree Based Graph Kernels

1 code implementation 23 Feb 2020 Wei Ye, Zhen Wang, Rachel Redberg, Ambuj Singh

At the heart of Tree++ is a graph kernel called the path-pattern graph kernel.

Graph Similarity

PKUSE at SemEval-2019 Task 3: Emotion Detection with Emotion-Oriented Neural Attention Network

no code implementations SEMEVAL 2019 Luyao Ma, Long Zhang, Wei Ye, Wenhui Hu

This paper presents the system in SemEval-2019 Task 3, "EmoContext: Contextual Emotion Detection in Text".

Deep Dynamic Boosted Forest

no code implementations 19 Apr 2018 Haixin Wang, Xingzhang Ren, Jinan Sun, Wei Ye, Long Chen, Muzhi Yu, Shikun Zhang

Specifically, we propose to measure the quality of each leaf node of every decision tree in the random forest to determine hard examples.

Ensemble Learning
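
One plausible reading of measuring per-leaf quality to find hard examples is to score each (tree, leaf) pair by its training accuracy and flag samples that fall into low-accuracy leaves; the scikit-learn sketch below illustrates only that reading, not the authors' exact procedure.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
rf = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

leaf_ids = rf.apply(X)                                         # (n_samples, n_trees)
tree_preds = np.stack([t.predict(X) for t in rf.estimators_], axis=1)

# Score every (tree, leaf) by its training accuracy, then average the scores per sample.
leaf_quality = np.zeros(leaf_ids.shape, dtype=float)
for t in range(leaf_ids.shape[1]):
    for leaf in np.unique(leaf_ids[:, t]):
        mask = leaf_ids[:, t] == leaf
        leaf_quality[mask, t] = (tree_preds[mask, t] == y[mask]).mean()

sample_quality = leaf_quality.mean(axis=1)
hard_examples = np.argsort(sample_quality)[:10]                # lowest scores => hard examples
print(hard_examples)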
