Explanations
expressions of pride and accomplishment
NEGATIVE LOGITS
Token        Logit
DECL         -0.17
_PHP         -0.16
UGH          -0.15
reesome      -0.15
iro          -0.14
ãģıãĤĵ       -0.14
boo          -0.14
dsl          -0.14
Fare         -0.14
utherford    -0.14
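One entry in the negative-logit list, `ãģıãĤĵ`, looks like mojibake but is probably a token shown in its raw byte-level form. The sketch below assumes the model uses a GPT-2 style byte-level BPE vocabulary (an assumption; the page does not name the tokenizer) and maps each displayed character back to the byte it stands for. The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of any tool referenced here.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Build the byte -> printable-character table used by GPT-2 style byte-level BPE.

    Printable bytes map to themselves; the remaining bytes are assigned
    code points starting at 256, in ascending byte order.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return dict(zip(byte_values, (chr(c) for c in code_points)))


# Invert the table: displayed character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level token string back into readable UTF-8 text."""
    raw = bytes(BYTE_DECODER[ch] for ch in token)
    # A single token may hold only part of a multi-byte character,
    # so undecodable bytes are replaced rather than raising.
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãģıãĤĵ"))
```

Under that assumption the entry decodes to a short Japanese string, so it is an ordinary vocabulary item rather than corrupted data. Fragments such as `reesome` and `utherford` are plain ASCII and pass through unchanged.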
POSITIVE LOGITS
Token        Logit
pride        0.81
proud        0.80
Proud        0.67
Pride        0.67
proudly      0.52
brag         0.38
boast        0.35
stol         0.31
boasted      0.30
boasting     0.28
Activations Density: 0.129%