INDEX
Explanations
personal pronouns indicating a sense of belonging or involvement in a group or situation
Negative Logits
orously      -0.82
ĸļ           -0.76
cru          -0.70
ready        -0.68
plin         -0.65
charg        -0.65
itialized    -0.64
amaru        -0.64
usp          -0.63
fried        -0.63
Positive Logits
sake         1.18
purposes     1.03
personally   1.02
selves       0.92
reasons      0.89
liking       0.85
self         0.84
guys         0.79
selves       0.76
ummies       0.70
Activations Density 0.073%