Explanations
references to Polish culture or identity
Negative Logits
oyer       -0.17
lain       -0.16
Ñıб        -0.15
lä         -0.15
aways      -0.15
periodic   -0.15
Kidd       -0.15
y          -0.15
ellers     -0.14
een        -0.14
Positive Logits
lock       0.24
ych        0.23
aris       0.23
onium      0.22
yc         0.22
ynomials   0.21
ka         0.21
ishing     0.21
anco       0.21
ishes      0.20
Activations Density: 0.012%