Explanations
references to significant historical figures or events
Negative Logits
Token        Logit
VÄĽ          -0.16
edii         -0.15
elters       -0.15
Trab         -0.15
eskort       -0.15
-ci          -0.14
Ukrainian    -0.14
995          -0.14
ville        -0.14
ulg          -0.14
Positive Logits
Token        Logit
Slovenia     0.19
Sloven       0.19
.si          0.17
Rav          0.17
lj           0.15
acija        0.15
asje         0.15
loven        0.15
isce         0.14
erken        0.14
Activations Density: 0.033%