Explanations
titles and roles associated with leadership positions
Negative Logits
sik               -0.16
enga              -0.15
ManagerInterface  -0.14
-purpose          -0.14
Ñıл               -0.14
rish              -0.14
Ĭ                 -0.13
ander             -0.13
aries             -0.13
ive               -0.13
Positive Logits
person            0.28
woman             0.20
persons           0.18
../../../         0.17
manship           0.17
lift              0.16
urette            0.16
lotte             0.16
783               0.16
rig               0.15
Activations Density 0.018%
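
The density figure above can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only. It assumes density means "fraction of positions with activation above zero", which the page itself does not state, and the function name, the threshold parameter, and the toy data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is taken to mean the share of positions where the
    feature is active (> threshold). This definition is not given in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 2 of 10,000 token positions.
toy = [0.0] * 9998 + [0.7, 1.3]

# Format as a percentage with three decimals, matching the style of the
# density figure reported above.
print(f"{activation_density(toy):.3%}")
```

Under this reading, a density of 0.018% would correspond to the feature being active on roughly 18 of every 100,000 token positions.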