INDEX
Explanations: probability of null or rejection
NEGATIVE LOGITS
sever   -0.11
Heller  -0.10
fi      -0.09
_vlog   -0.09
fug     -0.09
sub     -0.09
fro     -0.09
blow    -0.09
Moor    -0.09
incom   -0.08
POSITIVE LOGITS
null       0.21
Null       0.20
Null       0.19
null       0.18
Reject     0.17
Reject     0.17
reject     0.16
_null      0.16
rejecting  0.15
rejection  0.15
Activations Density: 0.010%