Calculating Percentage-Based Confidence from Similarities of Embedding Models

1 serengil 1 9/4/2025, 4:48:40 PM sefiks.com ↗

Comments (1)

serengil · 14h ago
We usually threshold cosine or Euclidean distances between embeddings to make hard classifications. But this has a big limitation: the raw distance gives no measure of confidence and is hard to interpret.

Instead, building a logistic regression model can turn distances into percentage-based confidence scores. This also accounts for how a small decrease in distance affects the confidence score, similar to how a derivative measures sensitivity.
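
To make the idea concrete, here is a minimal sketch in plain Python, not the code from the linked post. It assumes you have distances for labeled pairs (1 = same identity, 0 = different). The distance values, the function names, and the learning-rate and epoch settings are all made up for illustration. It fits p = sigmoid(w * d + b) by gradient descent, reports p as a percentage, and uses the analytic derivative dp/dd = w * p * (1 - p) as the sensitivity.

```python
import math

def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(distances, labels, lr=0.5, epochs=5000):
    """Fit p(same | d) = sigmoid(w * d + b) with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(distances)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for d, y in zip(distances, labels):
            err = sigmoid(w * d + b) - y  # gradient of log loss w.r.t. the logit
            grad_w += err * d
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def confidence(d, w, b):
    """Confidence (in percent) that a pair at distance d is the same identity."""
    return 100.0 * sigmoid(w * d + b)

def sensitivity(d, w, b):
    """Change in confidence (percentage points) per unit increase in distance."""
    p = sigmoid(w * d + b)
    return 100.0 * w * p * (1.0 - p)

# Hypothetical cosine distances for labeled pairs (illustrative values only).
same_pairs = [0.10, 0.15, 0.20, 0.25, 0.30, 0.45]
diff_pairs = [0.35, 0.50, 0.55, 0.60, 0.70, 0.80]
distances = same_pairs + diff_pairs
labels = [1] * len(same_pairs) + [0] * len(diff_pairs)

w, b = fit_logistic(distances, labels)

for d in (0.10, 0.40, 0.80):
    print(f"distance={d:.2f}  confidence={confidence(d, w, b):.1f}%  "
          f"sensitivity={sensitivity(d, w, b):.1f} points per unit distance")
```

Because smaller distances mean more similar embeddings, the fitted w comes out negative, so confidence rises as distance falls. The sensitivity is largest in magnitude near the decision boundary and flattens out at the extremes. In practice a library such as scikit-learn would do the fitting; the hand-rolled loop is only there to keep the sketch dependency-free.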