INDEX
Explanations
expressions of confidence and self-esteem
Negative Logits
Ber            -0.70
gdx            -0.69
asley          -0.66
к              -0.64
Brink          -0.62
sel            -0.62
ollectionView  -0.61
ber            -0.60
tark           -0.60
чан            -0.60
Positive Logits
confidence   1.74
confidence   1.65
Confidence   1.60
confident    1.53
Confidence   1.46
confident    1.43
confidently  1.31
itſelf       1.29
myſelf       1.25
confiance    1.19
Activations Density 0.061%