INDEX
Explanations
references to individuals and groups in various contexts
Negative Logits
zano          -0.16
="../../../   -0.15
apesh         -0.15
Courtesy      -0.15
erson         -0.15
roma          -0.14
aeper         -0.14
éij           -0.13
esser         -0.13
tas           -0.13
Positive Logits
own           0.32
ability       0.30
abilities     0.27
main          0.27
presence      0.25
contribution  0.25
efforts       0.24
role          0.24
biggest       0.24
latest        0.23
Activations Density 0.535%