INDEX
Explanations
phrases prompting reader interaction or engagement
frequent occurrences of the word "you."
Negative Logits
Token        Logit
ruciating    -0.79
ospital      -0.71
apo          -0.70
uces         -0.69
images       -0.68
icent        -0.68
uction       -0.68
ape          -0.66
Pg           -0.66
Lago         -0.65
Positive Logits
Token        Logit
guys         1.11
're          0.99
tub          0.96
know         0.86
prefer       0.86
subscribed   0.85
wanna        0.85
want         0.84
intend       0.84
've          0.84
Activations Density 0.054%
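
To make the density figure concrete, here is a minimal sketch that assumes "activations density" means the fraction of token positions on which the feature fires (activation above zero). The page does not state that definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density is counted per token position; the source page
    does not spell out its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 54 of which activate the feature.
sample = [0.0] * 99_946 + [1.3] * 54
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.054%
```

Under this reading, a density of 0.054% corresponds to the feature firing on roughly 54 of every 100,000 tokens.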