INDEX
Explanations
phrases indicating uncertainty or doubt
uncertainty about choices or information
Negative Logits
surla       -0.48
encar       -0.45
BoxFit      -0.41
adita       -0.36
sibling     -0.36
menudo      -0.35
miştir      -0.35
laid        -0.35
frutos      -0.34
snatched    -0.34
Positive Logits
unsure           0.81
Uncertain        0.69
Uncertainty      0.68
uncertain        0.68
Uncertainty      0.67
uncertainty      0.64
uncertainty      0.62
uncertainties    0.58
dunno            0.57
我不知道          0.56
Activations Density: 0.014%