INDEX
Explanations
references to confidence and self-esteem
Negative Logits
Ber      -0.71
gdx      -0.69
к        -0.64
м        -0.63
asley    -0.62
sel      -0.62
berk     -0.61
tark     -0.60
ber      -0.60
Brink    -0.60
Positive Logits
confidence     1.74
confidence     1.65
Confidence     1.64
confident      1.56
Confidence     1.47
confident      1.46
confidently    1.35
myſelf         1.28
confiance      1.22
itſelf         1.21
Activations Density: 0.056%