Explanations
punctuation marks
instances of the word "but."
Negative Logits
SourceFile        -0.73
interstitial      -0.68
orthy             -0.65
ACE               -0.63
ords              -0.62
ãĤ¶               -0.62
orses             -0.61
olves             -0.61
clave             -0.61
coverage          -0.60
Positive Logits
alas              1.17
uh                0.93
yeah              0.90
interestingly     0.89
unsurprisingly    0.83
unlike            0.82
unfortunately     0.80
secondly          0.80
yes               0.78
needless          0.75
Activations Density 0.087%