Search Results for author: Yu-Feng Li

Found 41 papers, 15 papers with code

Unlabeled Data or Pre-trained Model: Rethinking Semi-Supervised Learning and Pretrain-Finetuning

no code implementations19 May 2025 Song-Lin Lv, Rui Zhu, Yu-Feng Li, Lan-Zhe Guo

Therefore, a question naturally arises: when labeled data is scarce in the target task, should we exploit unlabeled data or pre-trained models?

Image Classification

Curriculum Abductive Learning

no code implementations18 May 2025 Wen-Chao Hu, Qi-Jie Li, Lin-Han Jia, Cunjing Ge, Yu-Feng Li, Yuan Jiang, Zhi-Hua Zhou

Abductive Learning (ABL) integrates machine learning with logical reasoning in a loop: a learning model predicts symbolic concept labels from raw inputs, which are revised through abduction using domain knowledge and then fed back for retraining.
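
The excerpt describes the ABL loop only at a high level. Purely as a rough, hypothetical sketch (not the authors' implementation; the toy a + b = c knowledge base and the `abduce` helper are illustrative assumptions), the predict-abduce-retrain cycle could look like this:

```python
import itertools
import numpy as np
from sklearn.linear_model import LogisticRegression

def consistent(labels):
    """Toy knowledge base: the three concept labels of a triple must satisfy a + b = c."""
    a, b, c = labels
    return a + b == c

def abduce(proba, classes):
    """Return the most probable label assignment for one triple that satisfies the knowledge base."""
    best, best_score = None, -np.inf
    for cand in itertools.product(range(len(classes)), repeat=proba.shape[0]):
        labels = [classes[j] for j in cand]
        if consistent(labels):
            score = sum(np.log(proba[i, j] + 1e-12) for i, j in enumerate(cand))
            if score > best_score:
                best, best_score = labels, score
    return best

# Synthetic raw inputs grouped into triples whose hidden concepts satisfy a + b = c.
rng = np.random.default_rng(0)
n_triples = 200
true = np.array([[a, b, a + b] for a, b in rng.integers(0, 2, size=(n_triples, 2))])
X = true.reshape(-1, 1) + 0.3 * rng.normal(size=(n_triples * 3, 1))   # noisy 1-D features
y_init = rng.integers(0, 3, size=n_triples * 3)                       # no ground-truth labels

model = LogisticRegression(max_iter=500).fit(X, y_init)               # weak initial model
for _ in range(5):                                                     # the ABL loop
    proba = model.predict_proba(X).reshape(n_triples, 3, -1)
    revised = np.array([abduce(p, model.classes_) for p in proba])     # abduction step
    model.fit(X, revised.reshape(-1))                                  # retrain on revised labels
```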

Logical Reasoning

LIFT+: Lightweight Fine-Tuning for Long-Tail Learning

1 code implementation17 Apr 2025 Jiang-Xin Shi, Tong Wei, Yu-Feng Li

The fine-tuning paradigm has emerged as a prominent approach for addressing long-tail learning tasks in the era of foundation models.

Data Augmentation Long-tail Learning

Micro Text Classification Based on Balanced Positive-Unlabeled Learning

1 code implementation17 Mar 2025 Lin-Han Jia, Lan-Zhe Guo, Zhi Zhou, Si-Ye Han, Zi-Wen Li, Yu-Feng Li

In real-world text classification tasks, negative texts often contain a minimal proportion of negative content, which is especially problematic in areas like text quality control, legal risk screening, and sensitive information interception.

Text Classification

LawGPT: Knowledge-Guided Data Generation and Its Application to Legal LLM

1 code implementation10 Feb 2025 Zhi Zhou, Kun-Yang Yu, Shi-Yu Tian, Jiang-Xin Shi, Xiao-Wen Yang, Pengxiao Song, Yi-Xuan Jin, Lan-Zhe Guo, Yu-Feng Li

To address these limitations, we study data generation for legal reasoning to improve the legal reasoning performance of open-source LLMs with the help of proprietary LLMs.

Legal Reasoning

Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models

1 code implementation6 Feb 2025 Xiao-Wen Yang, Xuan-Yi Zhu, Wen-Da Wei, Ding-Chu Zhang, Jie-Jing Shao, Zhi Zhou, Lan-Zhe Guo, Yu-Feng Li

The integration of slow-thinking mechanisms into large language models (LLMs) offers a promising way toward achieving Level 2 AGI Reasoners, as exemplified by systems like OpenAI's o1.

Bridging Internal Probability and Self-Consistency for Effective and Efficient LLM Reasoning

no code implementations1 Feb 2025 Zhi Zhou, Tan Yuhao, Zenan Li, Yuan YAO, Lan-Zhe Guo, Xiaoxing Ma, Yu-Feng Li

In this paper, we present the first theoretical error decomposition analysis of these techniques, breaking down their error into estimation error and model error.
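
The paper's exact decomposition is not shown in this excerpt; purely as a hedged illustration of how such an estimation-error / model-error split is typically written, a triangle-inequality style decomposition might read:

```latex
% Hedged illustration only; the paper's precise definitions may differ.
% Let $p^*$ denote the true answer distribution, $p_\theta$ the model's
% distribution, and $\hat{p}_N$ the estimate from $N$ sampled reasoning
% paths (e.g., via self-consistency). By the triangle inequality,
\[
  \underbrace{\|\hat{p}_N - p^*\|}_{\text{total error}}
  \le
  \underbrace{\|\hat{p}_N - p_\theta\|}_{\text{estimation error}}
  +
  \underbrace{\|p_\theta - p^*\|}_{\text{model error}},
\]
% where the estimation error shrinks as $N$ grows, while the model error does not.
```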

Contrast-Aware Calibration for Fine-Tuned CLIP: Leveraging Image-Text Alignment

no code implementations31 Jan 2025 Song-Lin Lv, Yu-Yang Chen, Zhi Zhou, Yu-Feng Li, Lan-Zhe Guo

Vision-language models (VLMs), such as CLIP, have demonstrated exceptional generalization capabilities and can quickly adapt to downstream tasks through prompt fine-tuning.

TabFSBench: Tabular Benchmark for Feature Shifts in Open Environment

1 code implementation31 Jan 2025 Zi-Jian Cheng, Zi-Yi Jia, Zhi Zhou, Lan-Zhe Guo, Yu-Feng Li

TabFSBench evaluates the impact of four distinct feature-shift scenarios on four categories of tabular models across various datasets, and assesses the performance of large language models (LLMs) and tabular LLMs on the tabular benchmark for the first time.

Vision-Language Model Selection and Reuse for Downstream Adaptation

no code implementations30 Jan 2025 Hao-Zhe Tan, Zhi Zhou, Yu-Feng Li, Lan-Zhe Guo

The proposal is highly computationally efficient and scalable, since the model labeling process is completed independently of the target task and its capability grows with the number of candidate VLMs.

Language Modeling +1

Robust Semi-Supervised Learning in Open Environments

no code implementations24 Dec 2024 Lan-Zhe Guo, Lin-Han Jia, Jie-Jing Shao, Yu-Feng Li

Conventional SSL studies typically assume closed environments where important factors (e.g., labels, features, distributions) are consistent between labeled and unlabeled data.

ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning

no code implementations18 Dec 2024 Jie-Jing Shao, Xiao-Wen Yang, Bo-Wen Zhang, Baizhi Chen, Wen-Da Wei, Guohao Cai, Zhenhua Dong, Lan-Zhe Guo, Yu-Feng Li

Recent advances in LLMs, particularly in language reasoning and tool integration, have rapidly sparked the real-world development of Language Agents.

You Only Submit One Image to Find the Most Suitable Generative Model

no code implementations16 Dec 2024 Zhi Zhou, Lan-Zhe Guo, Peng-Xiao Song, Yu-Feng Li

In this paper, we propose a novel setting called Generative Model Identification (GMI), which aims to enable users to efficiently identify the most appropriate generative model(s) for their requirements from a large number of candidate models.

Image Generation Text Matching

Fully Test-time Adaptation for Tabular Data

no code implementations14 Dec 2024 Zhi Zhou, Kun-Yang Yu, Lan-Zhe Guo, Yu-Feng Li

To this end, we propose Fully Test-time Adaptation for Tabular data (FTAT), which enables FTTA methods to robustly optimize the label distribution of predictions, adapt to shifted covariate distributions, and effectively suit a variety of tasks and models.

Data Augmentation Test-time Adaptation

Neuro-Symbolic Data Generation for Math Reasoning

no code implementations6 Dec 2024 Zenan Li, Zhi Zhou, Yuan YAO, Yu-Feng Li, Chun Cao, Fan Yang, Xian Zhang, Xiaoxing Ma

A critical question about Large Language Models (LLMs) is whether their apparent deficiency in mathematical reasoning is inherent, or merely a result of insufficient exposure to high-quality mathematical data.

Diversity Math +1

Learning for Long-Horizon Planning via Neuro-Symbolic Abductive Imitation

1 code implementation27 Nov 2024 Jie-Jing Shao, Hao-Ran Hao, Xiao-Wen Yang, Yu-Feng Li

In contrast, traditional symbolic planning excels in long-horizon tasks through logical reasoning over human-defined symbolic spaces but struggles to handle observations beyond symbolic states, such as high-dimensional visual inputs encountered in real-world scenarios.

Imitation Learning Logical Reasoning

RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models

1 code implementation17 Oct 2024 Haoran Hao, Jiaming Han, Changsheng Li, Yu-Feng Li, Xiangyu Yue

To further improve generation quality and alignment with user-specific information, we design a pipeline for data collection and create a specialized dataset for personalized training of MLLMs.

Image Captioning Question Answering +1

Vision-Language Models are Strong Noisy Label Detectors

1 code implementation29 Sep 2024 Tong Wei, Hao-Tian Li, Chun-Shu Li, Jiang-Xin Shi, Yu-Feng Li, Min-Ling Zhang

The proposed framework establishes a noisy label detector by learning positive and negative textual prompts for each class.
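
As a hedged sketch of the detection rule described above (the random features standing in for CLIP embeddings and the `detect_noisy_labels` helper are illustrative assumptions, not the authors' code):

```python
import numpy as np

def detect_noisy_labels(image_feats, pos_prompt_feats, neg_prompt_feats, given_labels):
    """Flag a sample as noisily labeled when, for its assigned class, the image
    agrees more with the learned negative prompt than with the positive one."""
    def cos(a, b):
        a = a / np.linalg.norm(a, axis=-1, keepdims=True)
        b = b / np.linalg.norm(b, axis=-1, keepdims=True)
        return a @ b.T
    sim_pos = cos(image_feats, pos_prompt_feats)   # (n_samples, n_classes)
    sim_neg = cos(image_feats, neg_prompt_feats)   # (n_samples, n_classes)
    idx = np.arange(len(given_labels))
    return sim_neg[idx, given_labels] > sim_pos[idx, given_labels]   # True = suspected noisy

# Toy usage: random vectors stand in for CLIP image embeddings and learned prompt embeddings.
rng = np.random.default_rng(0)
images = rng.normal(size=(8, 512))
pos_prompts, neg_prompts = rng.normal(size=(10, 512)), rng.normal(size=(10, 512))
labels = rng.integers(0, 10, size=8)
print(detect_noisy_labels(images, pos_prompts, neg_prompts, labels))
```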

Denoising Image Classification +1

Enabling Small Models for Zero-Shot Selection and Reuse through Model Label Learning

no code implementations21 Aug 2024 Jia Zhang, Zhi Zhou, Lan-Zhe Guo, Yu-Feng Li

In this paper, we attempt to demonstrate that by constructing a model hub and aligning models with their functionalities using model labels, new tasks can be solved in a zero-shot manner by effectively selecting and reusing models in the hub.
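
A minimal sketch of matching a new task's description against textual model labels in a hub (the hub contents, the `select_model` helper, and TF-IDF matching are illustrative assumptions, not the paper's method):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical model hub: each entry pairs a model identifier with a textual
# "model label" describing what the model can do.
model_hub = {
    "resnet_cifar":   "classifies small natural images such as animals and vehicles",
    "bert_sentiment": "predicts the sentiment polarity of English product reviews",
    "unet_medical":   "segments organs and lesions in medical CT scans",
}

def select_model(task_description, hub):
    """Pick the hub model whose label best matches the new task's description."""
    names, labels = list(hub), list(hub.values())
    vec = TfidfVectorizer().fit(labels + [task_description])
    sims = cosine_similarity(vec.transform([task_description]), vec.transform(labels))[0]
    return names[sims.argmax()], sims

print(select_model("identify whether a customer review is positive or negative", model_hub))
```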

Image Classification Zero-Shot Learning

Offline Imitation Learning with Model-based Reverse Augmentation

no code implementations18 Jun 2024 Jie-Jing Shao, Hao-Sen Shi, Lan-Zhe Guo, Yu-Feng Li

Specifically, we build a reverse dynamic model from the offline demonstrations, which can efficiently generate trajectories leading to the expert-observed states in a self-paced style.
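
A toy sketch of a reverse dynamics model used to roll out trajectories backwards from expert-observed states (the 1-D chain environment and the `reverse_rollout` helper are illustrative assumptions, not the authors' implementation):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Toy offline demonstrations on a 1-D chain: s_{t+1} = s_t + a_t, a_t in {-1, +1}.
states = rng.integers(0, 10, size=500).astype(float)
actions = rng.choice([-1.0, 1.0], size=500)
next_states = states + actions

# Reverse dynamics model: from the *next* state, predict the predecessor state and action.
reverse_model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
reverse_model.fit(next_states.reshape(-1, 1), np.stack([states, actions], axis=1))

def reverse_rollout(expert_state, horizon=5):
    """Generate a synthetic trajectory that ends in an expert-observed state by
    repeatedly stepping backwards with the reverse dynamics model."""
    traj, s = [], np.array([expert_state], dtype=float)
    for _ in range(horizon):
        prev_s, a = reverse_model.predict(s.reshape(1, -1))[0]
        traj.append((prev_s, a, float(s[0])))   # (state, action, next_state)
        s = np.array([prev_s])
    return traj[::-1]                           # ordered forward in time

print(reverse_rollout(expert_state=7.0))
```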

Imitation Learning

Efficient and Long-Tailed Generalization for Pre-trained Vision-Language Model

1 code implementation18 Jun 2024 Jiang-Xin Shi, Chi Zhang, Tong Wei, Yu-Feng Li

For efficient adaptation, we treat the CLIP model as a black box and leverage the extracted features to obtain visual and textual prototypes for prediction.
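
A minimal sketch of prototype-based prediction with a black-box feature extractor (random vectors stand in for CLIP features; `build_prototypes`, `predict`, and the mixing weight `alpha` are illustrative assumptions, not the authors' exact procedure):

```python
import numpy as np

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def build_prototypes(support_feats, support_labels, text_feats, n_classes):
    """Visual prototypes: mean of few-shot image features per class.
    Textual prototypes: the frozen class-name text embeddings."""
    visual = np.stack([support_feats[support_labels == c].mean(0) for c in range(n_classes)])
    return normalize(visual), normalize(text_feats)

def predict(query_feats, visual_proto, text_proto, alpha=0.5):
    """Classify by a convex combination of similarity to visual and textual prototypes."""
    q = normalize(query_feats)
    scores = alpha * (q @ visual_proto.T) + (1 - alpha) * (q @ text_proto.T)
    return scores.argmax(axis=1)

# Toy usage: random vectors stand in for features extracted from a frozen CLIP encoder.
rng = np.random.default_rng(0)
n_classes, dim = 5, 512
support = rng.normal(size=(5 * 16, dim))
support_y = np.repeat(np.arange(5), 16)
text = rng.normal(size=(n_classes, dim))
queries = rng.normal(size=(3, dim))
v_proto, t_proto = build_prototypes(support, support_y, text, n_classes)
print(predict(queries, v_proto, t_proto))
```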

Image-text matching Language Modeling +2

LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model

2 code implementations7 Jun 2024 Zhi Zhou, Jiang-Xin Shi, Peng-Xiao Song, Xiao-Wen Yang, Yi-Xuan Jin, Lan-Zhe Guo, Yu-Feng Li

Large language models (LLMs), including both proprietary and open-source models, have showcased remarkable capabilities in addressing a wide range of downstream tasks.

Language Modeling +1

Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions

no code implementations7 Jun 2024 Shi-Yu Tian, Zhi Zhou, Lin-Han Jia, Lan-Zhe Guo, Yu-Feng Li

To further study this problem, we develop a benchmark called Problems with Missing and Contradictory conditions (PMC) and introduce two novel metrics to evaluate the performance of few-shot prompting methods in these scenarios.

Hallucination Mathematical Reasoning

DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection

1 code implementation1 Jun 2024 Zhi Zhou, Ming Yang, Jiang-Xin Shi, Lan-Zhe Guo, Yu-Feng Li

In this paper, we explore a problem setting called Open-world Prompt Tuning (OPT), which involves tuning prompts on base classes and evaluating on a combination of base and new classes.

Out-of-Distribution Detection

Investigating the Limitation of CLIP Models: The Worst-Performing Categories

no code implementations5 Oct 2023 Jie-Jing Shao, Jiang-Xin Shi, Xiao-Wen Yang, Lan-Zhe Guo, Yu-Feng Li

Contrastive Language-Image Pre-training (CLIP) provides a foundation model by integrating natural language into visual concepts, enabling zero-shot recognition on downstream tasks.

Prompt Engineering Zero-Shot Learning

Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts

1 code implementation18 Sep 2023 Jiang-Xin Shi, Tong Wei, Zhi Zhou, Jie-Jing Shao, Xin-Yan Han, Yu-Feng Li

The fine-tuning paradigm in addressing long-tail learning tasks has sparked significant interest since the emergence of foundation models.

Ranked #1 on Long-tail Learning on CIFAR-100-LT (ρ=100) (using extra training data)

Fine-Grained Image Classification Long-tail learning with class descriptors

A Survey on Extreme Multi-label Learning

4 code implementations8 Oct 2022 Tong Wei, Zhen Mao, Jiang-Xin Shi, Yu-Feng Li, Min-Ling Zhang

Multi-label learning has attracted significant attention from both academia and industry in recent decades.

Multi-Label Learning Survey

LAMDA-SSL: Semi-Supervised Learning in Python

1 code implementation9 Aug 2022 Lin-Han Jia, Lan-Zhe Guo, Zhi Zhou, Yu-Feng Li

The second part demonstrates the usage of LAMDA-SSL in detail through abundant examples.

Robust Deep Semi-Supervised Learning: A Brief Introduction

no code implementations12 Feb 2022 Lan-Zhe Guo, Zhi Zhou, Yu-Feng Li

Semi-supervised learning (SSL) is the branch of machine learning that aims to improve learning performance by leveraging unlabeled data when labels are insufficient.

STEP: Out-of-Distribution Detection in the Presence of Limited In-Distribution Labeled Data

no code implementations NeurIPS 2021 Zhi Zhou, Lan-Zhe Guo, Zhanzhan Cheng, Yu-Feng Li, ShiLiang Pu

However, in many real-world applications, it is desirable to have SSL algorithms that not only classify the samples drawn from the same distribution of labeled data but also detect out-of-distribution (OOD) samples drawn from an unknown distribution.

Out-of-Distribution (OOD) Detection

Prototypical Classifier for Robust Class-Imbalanced Learning

no code implementations22 Oct 2021 Tong Wei, Jiang-Xin Shi, Yu-Feng Li, Min-Ling Zhang

Deep neural networks have been shown to be very powerful methods for many supervised learning tasks.

Learning with noisy labels

Dash: Semi-Supervised Learning with Dynamic Thresholding

no code implementations1 Sep 2021 Yi Xu, Lei Shang, Jinxing Ye, Qi Qian, Yu-Feng Li, Baigui Sun, Hao Li, Rong Jin

In this work, we develop a simple yet powerful framework whose key idea is to select a subset of training examples from the unlabeled data when running existing SSL methods, so that only the unlabeled examples whose pseudo labels are related to the labeled data are used to train models.
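
A hedged sketch of such loss-based selection with a shrinking threshold (the exponential-decay schedule and the `select_unlabeled` helper are illustrative assumptions, not the paper's exact rule):

```python
import numpy as np

def select_unlabeled(losses, step, rho0=2.0, gamma=1.1):
    """Keep only unlabeled examples whose pseudo-label loss is below a threshold
    that decays over optimization steps (rho_t = rho0 * gamma ** -t)."""
    threshold = rho0 * gamma ** (-step)
    return np.where(losses < threshold)[0]

# Toy usage: per-example losses of pseudo-labeled unlabeled data.
rng = np.random.default_rng(0)
losses = rng.exponential(scale=1.0, size=20)
for step in range(0, 30, 10):
    kept = select_unlabeled(losses, step)
    print(f"step {step:2d}: keep {len(kept)}/20 unlabeled examples")
```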

Semi-Supervised Image Classification

Robust Long-Tailed Learning under Label Noise

no code implementations26 Aug 2021 Tong Wei, Jiang-Xin Shi, Wei-Wei Tu, Yu-Feng Li

To overcome this limitation, we establish a new prototypical noise detection method by designing a distance-based metric that is resistant to label noise.
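
A toy sketch of distance-to-prototype noise detection (the nearest-prototype rule and the `prototypical_noise_detection` helper are illustrative assumptions, not the authors' exact metric):

```python
import numpy as np

def prototypical_noise_detection(feats, labels, n_classes):
    """Flag a sample as likely mislabeled when it lies closer to another class's
    prototype than to the prototype of its assigned class."""
    protos = np.stack([feats[labels == c].mean(0) for c in range(n_classes)])
    dists = np.linalg.norm(feats[:, None, :] - protos[None, :, :], axis=-1)  # (n, C)
    return dists.argmin(axis=1) != labels      # True = suspected noisy label

# Toy usage: two Gaussian clusters with 10% of the labels flipped.
rng = np.random.default_rng(0)
feats = np.concatenate([rng.normal(0, 1, (50, 8)), rng.normal(4, 1, (50, 8))])
labels = np.repeat([0, 1], 50)
flip = rng.choice(100, size=10, replace=False)
labels[flip] = 1 - labels[flip]
print(prototypical_noise_detection(feats, labels, 2)[flip])   # mostly True for flipped samples
```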

Image Classification

NGC: A Unified Framework for Learning with Open-World Noisy Data

no code implementations ICCV 2021 Zhi-Fan Wu, Tong Wei, Jianwen Jiang, Chaojie Mao, Mingqian Tang, Yu-Feng Li

The existence of noisy data is prevalent in both the training and testing phases of machine learning systems, which inevitably leads to the degradation of model performance.

Image Classification

Improving Tail Label Prediction for Extreme Multi-label Learning

no code implementations1 Jan 2021 Tong Wei, Wei-Wei Tu, Yu-Feng Li

Extreme multi-label learning (XML) works to annotate objects with relevant labels from an extremely large label set.

Multi-Label Learning

Weakly Supervised Learning Meets Ride-Sharing User Experience Enhancement

no code implementations19 Jan 2020 Lan-Zhe Guo, Feng Kuang, Zhang-Xun Liu, Yu-Feng Li, Nan Ma, Xiao-Hu Qie

For example, in user experience enhancement from Didi, one of the largest online ride-sharing platforms, the ride comment data contains severe label noise (due to the subjective factors of passengers) and severe label distribution bias (due to the sampling bias).

Weakly-supervised Learning

Reliable Weakly Supervised Learning: Maximize Gain and Maintain Safeness

no code implementations22 Apr 2019 Lan-Zhe Guo, Yu-Feng Li, Ming Li, Jin-Feng Yi, Bo-Wen Zhou, Zhi-Hua Zhou

We guide the optimization of label quality through a small amount of validation data, ensuring safe performance while maximizing the performance gain.
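
A hedged sketch of validation-guided, safeness-aware selection (the candidate weightings, the fallback-to-baseline rule, and the `safe_weakly_supervised_fit` helper are illustrative assumptions, not the paper's optimization):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def safe_weakly_supervised_fit(X_weak, y_weak, weight_candidates, X_val, y_val, baseline):
    """Try several per-example weightings of the weak labels, keep the one with the
    best validation accuracy, and fall back to the baseline model if none beats it."""
    best_model, best_acc = baseline, baseline.score(X_val, y_val)
    for w in weight_candidates:
        model = LogisticRegression(max_iter=500).fit(X_weak, y_weak, sample_weight=w)
        acc = model.score(X_val, y_val)
        if acc > best_acc:
            best_model, best_acc = model, acc
    return best_model, best_acc

# Toy usage: noisy weak labels, a small clean validation set, and two candidate weightings.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y_true = (X[:, 0] > 0).astype(int)
y_weak = np.where(rng.random(300) < 0.3, 1 - y_true, y_true)          # 30% label noise
X_val = rng.normal(size=(50, 5))
y_val = (X_val[:, 0] > 0).astype(int)
baseline = LogisticRegression(max_iter=500).fit(X[:20], y_true[:20])  # tiny clean baseline
candidates = [np.ones(300), np.where(y_weak == 1, 0.5, 1.0)]
print(safe_weakly_supervised_fit(X, y_weak, candidates, X_val, y_val, baseline)[1])
```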

Weakly-supervised Learning

Convex and Scalable Weakly Labeled SVMs

no code implementations6 Mar 2013 Yu-Feng Li, Ivor W. Tsang, James T. Kwok, Zhi-Hua Zhou

In this paper, we study the problem of learning from weakly labeled data, where labels of the training examples are incomplete.

Clustering Information Retrieval +1

Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison

no code implementations NeurIPS 2012 Tianbao Yang, Yu-Feng Li, Mehrdad Mahdavi, Rong Jin, Zhi-Hua Zhou

Both random Fourier features and the Nyström method have been successfully applied to efficient kernel learning.
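
As a quick illustration of the two approximations being compared, here is a minimal NumPy sketch under the standard definitions of the Nyström method and random Fourier features for an RBF kernel (not the paper's experimental setup):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
gamma, m = 0.5, 50                                     # RBF width and approximation rank

def rbf(A, B):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K = rbf(X, X)                                          # exact kernel matrix

# Nystrom: sample m landmark points, K ~ K_nm K_mm^+ K_nm^T.
idx = rng.choice(len(X), size=m, replace=False)
K_nm, K_mm = rbf(X, X[idx]), rbf(X[idx], X[idx])
K_nystrom = K_nm @ np.linalg.pinv(K_mm) @ K_nm.T

# Random Fourier features: K(x, y) ~ z(x)^T z(y) with z(x) = sqrt(2/m) cos(Wx + b).
W = rng.normal(scale=np.sqrt(2 * gamma), size=(X.shape[1], m))
b = rng.uniform(0, 2 * np.pi, size=m)
Z = np.sqrt(2.0 / m) * np.cos(X @ W + b)
K_rff = Z @ Z.T

for name, approx in [("Nystrom", K_nystrom), ("RFF", K_rff)]:
    err = np.linalg.norm(K - approx) / np.linalg.norm(K)
    print(f"{name}: relative Frobenius error = {err:.3f}")
```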
