INDEX
Explanations
personalized messages indicating something specific is meant for the reader
references to the second person, particularly the word "you" in various contexts
Negative Logits
ĸļ          -0.82
ipal        -0.80
entimes     -0.77
aughed      -0.73
ice         -0.67
apo         -0.66
ariat       -0.63
pite        -0.62
urtle       -0.62
oche        -0.61
Positive Logits
guys        1.44
tub         1.28
RS          1.15
're         1.08
Tube        0.89
NG          0.86
yourselves  0.83
sir         0.81
yourself    0.81
hei         0.79
Activations Density: 0.109%