INDEX
Explanations
references to responsibilities and interpersonal relations in various contexts
Negative Logits
nakalista       -1.01
kasarigan       -0.76
HideFlags       -0.73
########.       -0.71
سكانية          -0.69
Taktlose        -0.67
Signalez        -0.66
                -0.63
Билгалдахарш    -0.62
RegressionTest  -0.60
Positive Logits
others    2.26
Others    1.97
others    1.94
Others    1.89
OTHERS    1.73
دیگران    1.32
别人      1.22
other     1.21
someone   1.18
他人      1.13
Activations Density 0.285%