Let the Fuzzy Rule Speak: Enhancing In-context Learning Debiasing with Interpretability

26 Dec 2024  ·  Ruixi Lin, Yang You ·

One potential failure mode of large language models (LLMs) is imbalanced per-class performance in text classification tasks. With in-context learning (ICL), LLMs yield good accuracy for some classes but low accuracy for others. This imbalance is particularly problematic when misclassifications lead to user dissatisfaction or safety risks. While the root causes may lie in the data, addressing them at the source through training is neither easy nor cost-effective. Looking deeper, the imbalance stems from certain classes consistently receiving disproportionately high ICL probabilities while others receive lower ones, leading to under-prediction and lower accuracy for the latter. Crucially, different probability ranges contribute differently to the imbalance, which enables precise, range-specific corrections. This work therefore introduces an inference-time debiasing method, FuRud (Fuzzy Rule Optimization-based Debiasing). FuRud addresses core interpretability challenges by determining why certain classes require corrections and by tailoring adjustments to each sample and class probability. The corrections use fuzzy sets with triangular membership functions, which transform per-sample class probabilities according to the range they fall in. Each class selects one of 19 triangular membership functions by solving a nonlinear integer programming problem with simulated annealing, minimizing class accuracy bias (COBias) and maximizing overall accuracy without updating LLM parameters. Across seven benchmark datasets, FuRud reduces COBias by more than half (56%) while achieving a relative increase of 21% in overall accuracy, outperforming state-of-the-art debiasing methods.
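To illustrate the core correction mechanism, the sketch below applies a triangular membership function to each class probability and renormalizes. This is a minimal, assumed reading of the abstract: the function names, the specific membership parameters, and the example values are hypothetical, and the paper's actual 19 membership functions and simulated-annealing selection step are not reproduced here.

```python
import numpy as np

def triangular_membership(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def debias_probs(probs, memberships):
    """Transform each class probability with its selected membership
    function, then renormalize. Illustrative only; the paper's exact
    per-class functions are chosen by optimization, not hand-picked."""
    corrected = np.array([triangular_membership(p, *m)
                          for p, m in zip(probs, memberships)])
    total = corrected.sum()
    return corrected / total if total > 0 else np.asarray(probs, dtype=float)

# Hypothetical example: class 1 is under-predicted, so its membership
# function (a, b, c) peaks in the low-probability range, amplifying it.
probs = [0.7, 0.2, 0.1]
memberships = [(0.0, 0.5, 1.0),   # broad function for class 0
               (0.0, 0.25, 0.5),  # peaks near low probabilities for class 1
               (0.0, 0.5, 1.0)]   # broad function for class 2
print(debias_probs(probs, memberships))  # class 1 now receives the top score
```

Because each class applies its own range-sensitive transform, a probability that falls in a chronically under-weighted range can be boosted without retraining the LLM, which is the interpretability angle the abstract emphasizes.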
