INDEX
Explanations
names of people and organizations
Negative Logits
Token         Logit
ereg          -0.16
olls          -0.16
$($           -0.16
imens         -0.15
agens         -0.15
agua          -0.15
oller         -0.15
roker         -0.14
ollectors     -0.14
čen           -0.14
Positive Logits
Token         Logit
oslav         0.20
elyn          0.19
islav         0.17
fred          0.16
bert          0.16
elda          0.16
рун           0.16
енка          0.15
fried         0.15
inda          0.15
Activations Density 0.674%
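
The density figure above is most naturally read as the share of sampled token positions on which the feature has a non-zero activation. The page itself does not state the definition, so the sketch below treats it as an assumption; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of sampled token positions
    where the feature fires; the dashboard does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 1000 token positions, 7 of them non-zero.
sample = [0.0] * 993 + [0.5, 1.2, 0.3, 2.0, 0.8, 0.1, 0.9]
print(f"{activation_density(sample):.3f}%")  # prints 0.700%
```

Under that reading, a density of 0.674% would correspond to the feature firing on roughly 7 of every 1000 sampled tokens.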