INDEX
Explanations
references to government officials or leaders, specifically prime ministers
Negative Logits
ODEV        -0.15
ÑĢеÑī       -0.15
mans        -0.14
oundary     -0.14
room        -0.14
(es         -0.14
vron        -0.14
ACTER       -0.13
ãģ£ãģ¡      -0.13
prising     -0.13
Positive Logits
Minister    0.33
minister    0.31
-min        0.23
ministers   0.22
Ministers   0.21
min         0.20
mover       0.20
movers      0.20
_Min        0.19
_min        0.19
Activations Density 0.007%