INDEX
Explanations
possessive pronouns followed by words related to personal belongings or actions
possessive pronouns indicating ownership or association
Negative Logits
Token     Logit
Ambro     -0.63
021       -0.57
Unsure    -0.53
$$$$      -0.49
Links     -0.49
haus      -0.48
046       -0.47
191       -0.47
TBD       -0.46
Jol       -0.46
Positive Logits
Token       Logit
own         1.09
self        0.70
deepest     0.69
respective  0.69
tremend     0.68
rightful    0.64
newfound    0.63
beloved     0.62
selves      0.62
favorite    0.62
Activations Density: 0.372%