Explanations
references to Polish entities or cultural elements
Negative Logits
Token     Logit
icap      -0.18
urette    -0.16
leton     -0.15
iest      -0.15
opal      -0.14
ccount    -0.14
acular    -0.14
lä        -0.14
ane       -0.14
esda      -0.14
Positive Logits
Token       Logit
ongan       0.16
ynomials    0.15
ppo         0.15
atr         0.14
andr        0.14
appid       0.14
र           0.14
rose        0.14
eum         0.14
Barton      0.13
Activations Density 0.019%