INDEX
Explanations
verbs related to decision making
Negative Logits
furt         -0.77
76561        -0.67
quartered    -0.62
teen         -0.60
cursing      -0.59
Falling      -0.58
estate       -0.57
pend         -0.56
backer       -0.56
Shooting     -0.56
Positive Logits
't           1.25
afford       1.24
safely       1.13
overcome     1.05
feas         1.02
improve      1.02
withstand    1.00
berra        0.98
replicate    0.98
achieve      0.98
Activations Density: 0.165%