INDEX
Explanations
instances of comparison or conditionality
phrases that indicate conditions or situations involving alternatives or choices
Negative Logits
Token      Logit
ocracy     -0.70
Score      -0.63
then       -0.60
clusive    -0.60
est        -0.60
vered      -0.60
rous       -0.59
ETS        -0.59
onomy      -0.58
NOW        -0.58
Positive Logits
Token            Logit
else             1.09
otherwise        1.08
worse            1.06
chard            1.00
acles            0.98
alternatively    0.96
nam              0.95
ifice            0.90
acle             0.90
Else             0.90
Activations Density 0.123%
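
As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that density means the fraction of token positions on which the feature's activation is nonzero. That definition is an assumption, not something stated on this page, and the function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 123 of 100,000 token positions.
sample = [1.5] * 123 + [0.0] * (100_000 - 123)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.123%
```

Under this reading, a density of 0.123% means the feature is active on roughly one token in every 800.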