INDEX
Explanations
actions involving conflict or confrontation
Negative Logits
चीज़ों  -0.76
[token not captured]  -0.64
Wikimedijinoj  -0.62
Portale  -0.59
الدراسه  -0.58
الحياه  -0.57
незавершена  -0.57
Географиясе  -0.55
للمعارف  -0.54
Normdatei  -0.54
Positive Logits
entera  0.36
limpi  0.35
quickly  0.35
Öff  0.34
เอ  0.34
gently  0.34
slowly  0.34
yeter  0.33
rodea  0.33
cintas  0.33
Activations Density 0.416%
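
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of sampled tokens on which the feature's activation is above zero; the dashboard itself does not state its definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 tokens, 4 of which activate the feature.
sample = [0.0] * 996 + [0.8, 1.3, 0.2, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.416% would correspond to the feature firing on roughly 4 of every 1000 tokens.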