INDEX
Explanations
pronouns and references to individuals or groups
negation and biological terms
Negative Logits
виправивши        -0.46
vician            -0.43
lans              -0.36
centes            -0.36
fortunes          -0.36
CreateTagHelper   -0.35
liputi            -0.33
pherals           -0.33
patas             -0.32
schloss           -0.32
Positive Logits
Италијани         0.55
PreferredItem     0.49
#                 0.47
Walkover          0.47
TagMode           0.47
Derbyniad         0.46
Personensuche     0.46
RTGC              0.45
cyklopedia        0.45
RTLU              0.43
Activations Density 0.036%