INDEX
Explanations
references to individuals in directorial or leadership positions
Negative Logits
osci       -0.18
лю         -0.17
ingly      -0.15
bish       -0.15
ICATION    -0.15
emit       -0.14
ροι        -0.14
ocker      -0.14
oj         -0.14
uka        -0.14
Positive Logits
ship       0.30
ial        0.30
ate        0.27
ivity      0.27
ships      0.24
ially      0.23
ates       0.22
orate      0.22
ium        0.20
-general   0.19
Activations Density 0.054%