Negative Logits
nou         -0.08
ANTE        -0.07
nest        -0.07
Stanton     -0.07
Emmanuel    -0.07
Сан         -0.07
Stanley     -0.07
Nelson      -0.07
Brent       -0.07
몽          -0.07

Positive Logits
diff        0.16
Diff        0.15
diff        0.14
Diff        0.14
_diff       0.14
DIFF        0.11
DIFF        0.10
.diff       0.10
if          0.10
(diff       0.10
Activations Density: 0.013%