INDEX
Explanations
statements related to acceptance or approval
Negative Logits
Token       Logit
phabet      -0.76
APH         -0.75
Ranked      -0.72
ammy        -0.72
########    -0.71
loo         -0.70
Tycoon      -0.69
urst        -0.69
eways       -0.69
obiles      -0.68
Positive Logits
Token       Logit
accepting   0.99
ably        0.90
accept      0.88
accepts     0.84
admit       0.81
acceptance  0.81
uncond      0.79
admitting   0.78
willingly   0.77
defeat      0.75
Activations Density 2.151%