INDEX
NEGATIVE LOGITS
disturbances    -0.07
-Man            -0.07
Man             -0.07
Running         -0.07
_running        -0.07
Running         -0.07
-kind           -0.07
disturbance     -0.07
Alive           -0.07
extent          -0.07
POSITIVE LOGITS
拒              0.17
refusing        0.14
.reject         0.14
refused         0.14
rejects         0.14
رفض             0.14
Rejected        0.14
rejected        0.14
refusal         0.14
reject          0.14
ACTIVATIONS DENSITY
0.045%