INDEX
Explanations
References to the second-person pronoun "you."
Negative Logits
lok     -0.15
atti    -0.15
wers    -0.14
amp     -0.14
rang    -0.13
رÙģ     -0.13
idious  -0.13
byt     -0.13
iative  -0.13
rou     -0.13
Positive Logits
’re       0.21
SELF      0.19
're       0.17
-même     0.17
’ll       0.16
indsight  0.16
’ve       0.16
zelf      0.16
/us       0.16
OSE       0.15
Activations Density 0.467%
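
The density figure above can be read as the share of sampled token positions on which the feature fires. The sketch below illustrates that reading only; it assumes "density" means the fraction of activations strictly above a threshold of zero, and both the function name `activation_density` and the sample data are hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" here means the share of sampled positions on which
    the feature is active; the dashboard's exact definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 467 active positions out of 100,000 tokens.
sample = [1.0] * 467 + [0.0] * (100_000 - 467)
density = activation_density(sample)
print(f"{density:.3%}")
```

With this made-up sample the printed value matches the format of the figure reported above; a real computation would use the feature's recorded activations over the evaluation corpus.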