Explanations
negative expressions or prohibitions
Negative Logits
Token         Logit
typelib       -0.60
pokrač        -0.56
nakalista     -0.56
OGND          -0.54
beschik       -0.52
acted         -0.51
كومونز        -0.51
ComVisible    -0.50
vznik         -0.50
invested      -0.49
Positive Logits
Token         Logit
vastaan       0.40
quiao         0.40
riwal         0.38
sponsoring    0.37
хьтан         0.37
meille        0.36
ől            0.36
phology       0.35
herjee        0.35
iolis         0.34
Activations Density: 0.086%
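
The density figure is not defined on this page. A minimal sketch of one common reading, assuming it means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to reproduce a value of the same size.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the page itself does not state the definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 86 of 100,000 positions.
sample = [1.0] * 86 + [0.0] * 99_914
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density this low would mean the feature is sparse, firing on fewer than one token position in a thousand.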