INDEX
Explanations
references to specific geographic locations or place names
Negative Logits
chein    -0.17
bih      -0.17
iye      -0.16
Frog     -0.15
leo      -0.15
iros     -0.15
orio     -0.15
chten    -0.15
appare   -0.15
shm      -0.15
Positive Logits
ames     0.29
emens    0.26
enna     0.25
erra     0.24
oux      0.23
empre    0.22
esta     0.22
ERR      0.21
erral    0.19
err      0.19
Activations Density: 0.006%