Explanations
proper nouns, particularly names and organizations
Negative Logits
istros    -0.16
uko       -0.14
prend     -0.14
utin      -0.13
stakes    -0.13
UBL       -0.13
chn       -0.13
intree    -0.13
onde      -0.13
Kab       -0.13
Positive Logits
deen      0.15
ÙĬات      0.14
referer   0.14
WEEN      0.14
ĭ         0.13
apart     0.13
ittance   0.13
ussen     0.13
ceived    0.13
replic    0.13
Activations Density: 0.054%
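
The density figure can be read as the share of token positions on which the feature is active. The following is a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of entries strictly above `threshold`.

    Assumption: "density" means the fraction of token positions where
    the feature's activation is non-zero, expressed as a percentage.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 10,000 token positions, feature fires on 5 of them.
sample = [0.0] * 9995 + [1.2, 0.7, 3.1, 0.4, 2.2]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.054% would correspond to the feature firing on roughly 54 out of every 100,000 token positions.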