INDEX
Explanations
pronouns followed by possessive pronouns
pronouns in plural or possessive forms
Negative Logits
ĸļ            -0.68
fill          -0.67
Walters       -0.66
Cu            -0.66
Grey          -0.65
usa           -0.65
mire          -0.65
vine          -0.64
Sn            -0.64
parser        -0.64
Positive Logits
own           1.71
entire        1.11
reputation    1.01
fortunes      1.00
selves        0.97
allegiance    0.96
rightful      0.94
knees         0.93
arms          0.92
vision        0.91
Activations Density: 0.244%