INDEX
Explanations
mentions of various republics or countries
Negative Logits
allo    -0.19
adge    -0.16
ings    -0.15
poz     -0.14
akeup   -0.14
CCI     -0.14
itage   -0.14
Hitch   -0.14
repeat  -0.13
leys    -0.13
Positive Logits
anism            0.25
rats             0.23
ation            0.18
andom            0.17
اÛĮت             0.17
rat              0.16
wide             0.15
.defineProperty  0.15
ública           0.15
ans              0.15
Activations Density: 0.014%