INDEX
Explanations
mentions of possibilities or potential outcomes
Negative Logits
Token       Logit
core        -0.74
teen        -0.65
Balanced    -0.62
honors      -0.61
cloth       -0.61
furt        -0.60
ging        -0.59
honoring    -0.59
Yards       -0.59
Duty        -0.58
Positive Logits
Token          Logit
feas           1.39
conce          1.27
be             1.08
possibly       1.07
theoretically  1.05
afford         1.04
potentially    1.04
hypot          1.03
ivably         1.03
easily         1.01
Activations Density 0.303%
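
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that "activations density" means the fraction of sampled tokens on which the feature's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (tokens where the feature fires) / (all tokens).
    """
    acts = list(activations)
    if not acts:
        raise ValueError("need at least one activation value")
    firing = sum(1 for a in acts if a > threshold)
    return firing / len(acts)


# Hypothetical sample: 1000 tokens, with the feature firing on 3 of them.
sample = [0.0] * 1000
sample[10] = 0.5
sample[200] = 1.2
sample[999] = 0.1

density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.300%
```

Under this reading, a density of 0.303% would mean the feature fires on roughly 3 out of every 1000 tokens, which is consistent with a sparse feature tied to a narrow concept such as possibility or potential outcomes.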