INDEX
Explanations
instances of punctuation and emphasis in text
Negative Logits
Č                   -0.17
ayscale             -0.15
uby                 -0.14
olygon              -0.14
Butt                -0.13
lying               -0.13
ugg                 -0.13
견                  -0.13
PostBack            -0.13
terior              -0.13
Positive Logits
##                  0.22
č↵č↵č↵              0.20
####                0.19
###                 0.19
----------↵↵        0.17
---↵↵               0.17
↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵    0.16
######              0.15
rough               0.14
esses               0.14
Activations Density: 0.245%