INDEX
Explanations
references to geographic or governmental entities
Negative Logits
kou       -0.17
ouce      -0.16
outil     -0.15
anzi      -0.15
ouncer    -0.15
undos     -0.15
abal      -0.15
abay      -0.15
itle      -0.15
lement    -0.14
Positive Logits
ien       0.26
ussen     0.25
ij        0.25
ijd       0.21
egen      0.21
eg        0.20
eken      0.17
ewe       0.17
ient      0.17
ew        0.17
Activations Density: 0.007%