INDEX
Explanations
geographic locations and references to specific places
Negative Logits
cher      -0.16
obili     -0.15
somehow   -0.14
inha      -0.14
agher     -0.14
indr      -0.14
iali      -0.14
atron     -0.14
chas      -0.13
adam      -0.13
Positive Logits
едини     0.15
ħn        0.14
inton     0.14
serter    0.14
eters     0.14
ToLeft    0.14
280       0.14
feminine  0.14
OnInit    0.14
shire     0.13
Activations Density: 0.068%