Author here. Our paper “Calibrating LLM Confidence by Probing Perturbed
Representation Stability” was accepted to EMNLP 2025 Main Conference (top 15%) with a final rating of 9 (strong accept).
High-level summary:
We probe LLM hidden states with slight perturbations to check whether the answer remains stable: stability implies confidence; instability implies uncertainty. This lightweight method cuts calibration error by more than 50% (down to ~4.5%) across LLaMA, Mistral, and Qwen models on MMLU and MMLU-Pro, with no LLM fine-tuning.
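To make the core intuition concrete, here is a minimal, hypothetical sketch (not our released pipeline): perturb a hidden state slightly, re-read the predicted answer, and treat the fraction of unchanged answers as a confidence signal. The toy `answer_head`, perturbation scale, and sample count below are illustrative stand-ins, not values from the paper.

```python
import torch

def stability_confidence(hidden_state, answer_head, n_perturb=32, sigma=0.05):
    """Estimate confidence as the fraction of small perturbations of the
    final hidden state that leave the predicted answer unchanged.

    hidden_state: (d,) tensor -- last-layer representation at the answer position
    answer_head:  maps (d,) -> logits over answer options (toy stand-in for the
                  model's LM head restricted to the option tokens)
    """
    base_pred = answer_head(hidden_state).argmax().item()
    stable = 0
    for _ in range(n_perturb):
        noise = sigma * torch.randn_like(hidden_state)            # small Gaussian perturbation
        pred = answer_head(hidden_state + noise).argmax().item()  # re-read the answer
        stable += int(pred == base_pred)
    return stable / n_perturb  # high fraction -> stable -> confident


if __name__ == "__main__":
    torch.manual_seed(0)
    d, n_options = 4096, 4                # hypothetical hidden size / MCQ options
    head = torch.nn.Linear(d, n_options)  # toy answer head for illustration
    h = torch.randn(d)
    print(f"stability-based confidence: {stability_confidence(h, head):.2f}")
```

In the paper, the actual probe trained on perturbed-representation features is more involved than this raw stability fraction; see the repo below for the real implementation.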
Results, code, and dataset are available at:
- Code: https://github.com/ledengary/CCPS
- Data: https://huggingface.co/datasets/ledengary/CCPS
Happy to discuss technical details or calibration deployment strategies.