INDEX
Explanations
phrases or words indicating uncertainty or speculation
phrases that suggest alternatives or comparisons
Negative Logits
ires       -0.72
pire       -0.65
ETS        -0.63
urs        -0.62
edia       -0.62
efer       -0.62
hesive     -0.61
mitter     -0.60
ailable    -0.60
moil       -0.59
Positive Logits
acle        1.29
acles       1.27
chard       1.21
nam         1.21
acular      1.18
ifice       1.15
chid        1.13
whatever    1.12
Else        1.11
whatever    1.07
Activations Density 0.185%
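
The density figure above can be read as the share of token positions on which the feature is active. The sketch below shows one way such a number could be computed. It assumes that "active" means an activation strictly above zero; the function name, the threshold parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its activation is strictly
    greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2000 token positions, 4 of them with a nonzero activation.
sample = [0.0] * 1996 + [0.3, 1.2, 0.7, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.200%"
```

Under this reading, a density of 0.185% would mean the feature fires on roughly 1 in 540 token positions.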