Explanations
references to geographic locations and administrative divisions
Negative Logits
istrovství    -0.17
ifar          -0.16
swick         -0.16
drv           -0.16
conj          -0.16
baş           -0.15
conj          -0.15
heel          -0.14
pij           -0.14
Borders       -0.14
Positive Logits
Sul           0.20
Wang          0.20
Egg           0.19
Staff         0.18
Ob            0.18
Unt           0.18
Wild          0.17
Spielberg     0.17
icken         0.17
Hard          0.17
Activations Density 0.015%