INDEX
Negative Logits
behaved       -0.06
Manip         -0.06
ding          -0.06
FN            -0.06
Hat           -0.06
MQ            -0.06
Hat           -0.06
_group        -0.06
Talk          -0.06
произош       -0.06
Positive Logits
awakeFromNib  0.07
imposs        0.06
стров         0.06
/cpp          0.06
.priority     0.06
leo           0.06
tolerant      0.06
заним         0.06
pró           0.06
agent         0.06
Activations Density: 0.004%