Explanations
personal pronouns and possessive pronouns indicating interaction or involvement with others
pronouns and references to people in the context of relationships and interactions
Negative Logits
Token      Logit
gee        -0.63
dent       -0.62
sequence   -0.61
ĸļ         -0.60
ente       -0.60
ctory      -0.59
multipl    -0.58
ICE        -0.58
semb       -0.58
Comment    -0.58
Positive Logits
Token      Logit
selves     1.02
atically   0.80
atic       0.78
self       0.73
sorely     0.70
overe      0.69
loose      0.68
ãģ¦        0.67
ben        0.65
owsky      0.65
Activations Density: 0.151%