INDEX
Explanations
the first-person singular pronoun "I."
Negative Logits
leton      -0.19
Others     -0.14
_gift      -0.14
lero       -0.14
llen       -0.13
cret       -0.13
gger       -0.13
isay       -0.13
mbH        -0.13
rary       -0.13
Positive Logits
fuck       0.17
opo        0.17
ãģĬãĤĬ     0.15
nga        0.15
<decltype  0.15
avatars    0.15
ux         0.14
whereas    0.14
ucch       0.14
fuck       0.14
Activations Density 0.000%