INDEX
Explanations
phrases indicating alternative choices or options
conditional phrases or statements
Negative Logits
ires     -0.72
ursday   -0.70
onday    -0.67
uesday   -0.67
emi      -0.66
IDENT    -0.64
orter    -0.64
ETS      -0.64
estro    -0.64
erest    -0.63
POSITIVE LOGITS
chard           1.21
alternatively   1.09
acle            1.06
whatever        1.05
nam             1.04
chid            0.98
nery            0.98
ifice           0.97
acles           0.96
otherwise       0.95
Activations Density: 0.093%