INDEX
Explanations
personal pronouns, specifically 'You'
instances of the pronoun "You"
Negative Logits
shore       -0.69
¿½          -0.66
Gamb        -0.65
airs        -0.62
ãĥ³ãĤ¸      -0.60
stemming    -0.60
wrapper     -0.59
assemb      -0.58
ipal        -0.58
itud        -0.58
Positive Logits
're         1.41
've         1.23
'll         1.22
guys        1.08
guessed     1.07
'd          1.02
tub         0.99
ngth        0.96
imar        0.91
gotta       0.90
Activations Density 0.125%
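
To put the density figure in concrete terms, the sketch below converts it into a rate. It assumes that "Activations Density" means the fraction of tokens on which the feature has a nonzero activation; the source does not define the term. The helper name `density_to_rate` and the 100,000-token sample size are illustrative only.

```python
from fractions import Fraction


def density_to_rate(density_percent: str) -> Fraction:
    """Convert a percentage string such as '0.125%' to an exact fraction."""
    return Fraction(density_percent.rstrip("%")) / 100


# Value taken from the dashboard; its interpretation is an assumption (see above).
density = density_to_rate("0.125%")

# Average number of tokens between activations under that assumption.
tokens_per_activation = 1 / density

# Expected activating tokens in a hypothetical sample of 100,000 tokens.
expected_hits = density * 100_000

print(density, tokens_per_activation, expected_hits)
```

Under that reading, the feature fires on roughly one token in 800, which fits a feature tied to a single pronoun.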