INDEX
Explanations
words related to extroversion and introversion
Negative Logits
Nay        -0.17
arLayout   -0.16
hev        -0.16
uluk       -0.16
hti        -0.15
EXP        -0.15
utan       -0.15
Rip        -0.14
arias      -0.14
utf        -0.14
Positive Logits
ensive     0.36
remely     0.34
rem        0.33
reme       0.30
inction    0.29
ending     0.28
ortion     0.27
ensively   0.27
ensible    0.26
rema       0.25
Activations Density: 0.009%