Explanations
pronouns related to possession
possessive pronouns, specifically "their"
Negative Logits
Token        Logit
Bear         -0.74
scape        -0.74
Unsure       -0.73
zz           -0.72
FP           -0.70
Thompson     -0.69
CCC          -0.68
Flan         -0.67
Ïĥ           -0.67
eers         -0.66
Positive Logits
Token          Logit
own            1.52
respective     1.48
selves         1.22
selves         1.11
predecessors   1.05
namesake       1.01
self           1.01
successors     0.97
counterparts   0.97
newfound       0.97
Activations Density: 0.207%