INDEX
Explanations
references to organizational structures and roles within institutions
Negative Logits
ãĥ¼ãĥĨãĤ£    -0.15
–            -0.15
=            -0.15
yonel        -0.14
ères         -0.14
–↵↵          -0.14
asic         -0.14
istrovstvÃŃ  -0.13
(            -0.13
ifornia      -0.13
Positive Logits
&         0.24
"         0.24
>         0.21
é         0.20
&         0.19
lessness  0.19
ó         0.19
(.        0.18
(;        0.18
&a        0.18
Activations Density: 0.011%