Explanations
phrases expressing disappointment, disapproval, or outrage
expressions of disappointment or outrage
Negative Logits
aukee            -0.88
exting           -0.83
pione            -0.75
ById             -0.75
incial           -0.70
ounding          -0.69
ă                -0.69
ãĤ¡              -0.69
đ                -0.69
RandomRedditor   -0.68
Positive Logits
someone          1.11
somebody         1.02
anyone           0.95
nobody           0.94
so               0.90
such             0.85
anybody          0.84
they             0.79
people           0.76
we               0.75
Activations Density: 0.131%