Explanations
descriptors of characters' likability and personality traits
Negative Logits
Token        Logit
addir        -0.18
stral        -0.16
nave         -0.16
FC           -0.16
urat         -0.15
.hs          -0.15
div          -0.14
attendance   -0.14
psilon       -0.14
defgroup     -0.14

Positive Logits
Token        Logit
McGregor     0.17
ioned        0.16
personality  0.16
ÑĢиз         0.16
ÏĨι          0.15
ãĥ¼ãĥij       0.15
æĨ           0.15
koc          0.15
æĺŃåĴĮ       0.15
berger       0.14

Activations Density: 0.277%