INDEX
Explanations
Instances of the word "you" and its variants, indicating direct address to or engagement with the reader.
Negative Logits
are      -0.20
leston   -0.19
estão    -0.16
EITHER   -0.16
ют       -0.16
DON      -0.15
mani     -0.15
имеют    -0.14
plash    -0.14
either   -0.14
Positive Logits
guys       0.26
ever       0.22
suppose    0.19
remember   0.19
remembers  0.18
sometimes  0.17
perch      0.17
ever       0.17
Guys       0.17
Remember   0.17
Activations Density 0.074%