INDEX
Explanations
instances of the word "you" in various forms and contexts
NEGATIVE LOGITS
Token     Logit
atti      -0.16
itself    -0.16
nut       -0.15
andon     -0.15
store     -0.14
usted     -0.14
member    -0.14
foot      -0.14
line      -0.13
nb        -0.13
POSITIVE LOGITS
Token     Logit
’re       0.25
’ll       0.24
'll       0.24
're       0.22
’ve       0.21
-même     0.21
've       0.21
nger      0.20
’d        0.20
'd        0.19
Activations Density: 0.463%