Large Language and Vision-Language Models (LLMs/VLMs) are increasingly used in safety-critical applications, yet their opaque decision-making complicates risk assessment and reliability. Uncertainty quantification (UQ) helps assess prediction confidence and enables abstention when uncertainty is high. Conformal prediction (CP), a leading UQ method, provides statistical guarantees but relies on static thresholds, which fail to adapt to task complexity and evolving data distributions, leading to suboptimal trade-offs in accuracy, coverage, and informativeness.
To address this, we propose learnable conformal abstention, integrating reinforcement learning (RL) with CP to optimize abstention thresholds dynamically. By treating CP thresholds as adaptive actions, our approach balances multiple objectives, minimizing prediction set size while maintaining reliable coverage. Extensive evaluations across diverse LLM/VLM benchmarks show our method outperforms Least Ambiguous Classifiers (LAC) and Adaptive Prediction Sets (APS), improving accuracy by up to 3.2%, boosting AUROC for hallucination detection by 22.19%, enhancing uncertainty-guided selective generation (AUARC) by 21.17%, and reducing calibration error by 70%-85%. These improvements hold across multiple models and datasets while consistently meeting the 90% coverage target, establishing our approach as a more effective and flexible solution for reliable decision-making in safety-critical applications.
Our work presents an adaptive approach to risk management in large-scale language and vision-language models. Instead of relying on static uncertainty thresholds, we leverage reinforcement learning to dynamically set conformal prediction thresholds. This allows the model to decide when to output a single prediction, provide a set of plausible answers, or abstain altogether—depending on the uncertainty of the input.
This framework is especially beneficial in safety-critical applications where overconfident or miscalibrated predictions can have severe consequences.
Adaptive Conformal Abstention: Traditional conformal prediction guarantees a fixed coverage (e.g., 90%) by using static quantile thresholds. However, this rigidity fails under variable task complexity and evolving data distributions.
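For concreteness, a minimal sketch of the static split-conformal baseline is shown below; the LAC-style nonconformity score (one minus the softmax probability) and the function names are illustrative, not the paper's exact implementation.

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split conformal prediction: compute a static quantile threshold from
    nonconformity scores on a held-out calibration set. Prediction sets built
    with this threshold cover the true label with probability >= 1 - alpha."""
    n = len(cal_scores)
    # Finite-sample-corrected quantile level, clipped to a valid probability.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(cal_scores, q_level, method="higher")

def lac_prediction_set(softmax_probs, threshold):
    """LAC-style prediction set: keep every label whose nonconformity score
    (1 - softmax probability) does not exceed the calibrated threshold."""
    return np.where(1.0 - softmax_probs <= threshold)[0]
```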
We recast threshold selection as a reinforcement learning problem in which a policy network learns to adjust the CP thresholds, denoted by parameters α and β, based on performance feedback. Depending on the input's uncertainty, this yields three regimes: outputting a single confident prediction, returning a small set of plausible answers, or abstaining altogether.
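One way two learned thresholds could gate these regimes is sketched below; the specific roles of α and β and the max_set_size cutoff are simplifying assumptions for illustration, not the exact decision rule from the paper.

```python
import numpy as np

def abstention_decision(softmax_probs, alpha, beta, max_set_size=5):
    """Illustrative three-regime rule driven by two adaptive thresholds:
    alpha gates single-answer confidence, beta calibrates the prediction set."""
    top_label = int(np.argmax(softmax_probs))
    if softmax_probs[top_label] >= 1.0 - alpha:
        return "single", [top_label]                        # confident: one answer
    # Moderate uncertainty: build a LAC-style set with threshold beta.
    pred_set = np.where(1.0 - softmax_probs <= beta)[0]
    if len(pred_set) <= max_set_size:
        return "set", pred_set.tolist()
    return "abstain", []                                    # too uncertain: withhold
```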
The policy is trained using the REINFORCE algorithm to minimize a cost function that balances accuracy, prediction set size, and coverage.
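A minimal REINFORCE sketch in PyTorch is given below, assuming a discretized action space over candidate threshold settings and a user-supplied reward that combines accuracy, prediction set size, and a coverage penalty; the paper's actual cost function, network architecture, and state features are not reproduced here.

```python
import torch
import torch.nn as nn

class ThresholdPolicy(nn.Module):
    """Small policy network mapping context features to a categorical
    distribution over candidate (alpha, beta) threshold settings."""
    def __init__(self, feat_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return torch.distributions.Categorical(logits=self.net(x))

def reinforce_step(policy, optimizer, features, reward_fn):
    """One REINFORCE update: sample a threshold action, score it with a reward
    trading off accuracy, set size, and coverage, then ascend the reward-weighted
    log-likelihood (reward_fn is a hypothetical callback)."""
    dist = policy(features)
    action = dist.sample()
    reward = reward_fn(action)  # e.g. accuracy - c1 * set_size - c2 * coverage_gap
    loss = -(dist.log_prob(action) * reward).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```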
We evaluated our approach on a wide range of benchmarks for both LLMs and VLMs. Our evaluation metrics include accuracy, AUROC for hallucination detection, AUARC for uncertainty-guided selective generation, calibration error, and empirical coverage.
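As a reference point, calibration error is commonly measured with the expected calibration error (ECE); a standard implementation is sketched below as an assumption about the metric, not the paper's evaluation code.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """ECE: bin predictions by confidence, then average the absolute gap between
    each bin's accuracy and its mean confidence, weighted by the fraction of
    samples falling in the bin."""
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece
```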
Our approach outperforms static baselines, Least Ambiguous Classifiers (LAC) and Adaptive Prediction Sets (APS), improving accuracy by up to 3.2%, boosting AUROC for hallucination detection by 22.19%, enhancing AUARC by 21.17%, and reducing calibration error by 70%-85%, all while consistently meeting the 90% coverage target.
Comprehensive results for all models and datasets are presented in our experiments.
Our adaptive conformal abstention policy (CAP) significantly advances uncertainty quantification and risk management in large-scale language and vision-language models. By dynamically adjusting the decision thresholds using reinforcement learning, CAP offers a flexible, context-aware approach that not only improves accuracy and calibration but also ensures reliable coverage.
This work paves the way for safer and more interpretable AI systems in high-stakes applications such as healthcare, autonomous systems, and legal decision support.
@article{tayebati2025learning,
title={Learning Conformal Abstention Policies for Adaptive Risk Management in Large Language and Vision-Language Models},
author={Tayebati, Sina and Kumar, Divake and Darabi, Nastaran and Jayasuriya, Dinithi and Krishnan, Ranganath and Trivedi, Amit Ranjan},
journal={arXiv preprint arXiv:2502.06884},
year={2025}
}