Explanations
personal pronouns followed by actions
instances of the pronoun "you" and related personal references
Negative Logits
è£ıç        -0.90
è£ıè        -0.82
Additional  -0.72
âģ          -0.72
Mehran      -0.71
ollower     -0.67
Effective   -0.66
Luxem       -0.63
ä¹ĭ         -0.63
è£ı         -0.63
Positive Logits
kinda       1.48
gotta       1.42
dunno       1.37
wanna       1.27
definitely  1.21
REALLY      1.16
probably    1.10
've         1.08
'll         1.08
'd          1.07
Activations Density 0.382%