INDEX
Explanations
phrases related to comparisons
instances of lists or enumerations in descriptions
Negative Logits
%:            -0.82
escription    -0.75
hig           -0.74
gat           -0.73
resents       -0.70
edom          -0.68
icipated      -0.68
:[            -0.68
sylv          -0.66
roup          -0.66
Positive Logits
incidentally  1.41
alas          1.40
anyway        1.33
admittedly    1.21
eh            1.11
yeah          1.07
yes           1.06
huh           1.04
anyways       1.03
presumably    1.01
Activations Density: 0.217%