INDEX
Explanations
phrases related to possession or association
possessive pronouns or descriptors indicating ownership
Negative Logits
Token       Logit
finger      -0.68
verb        -0.67
atoes       -0.65
landers     -0.64
ominated    -0.62
=\"         -0.61
etheless    -0.61
Uriel       -0.59
igh         -0.59
Schr        -0.58
Positive Logits
Token       Logit
entirety    1.53
stead       1.33
infancy     1.22
haste       1.11
absence     1.04
wake        0.98
totality    0.96
spare       0.94
own         0.93
stride      0.91
Activations Density: 0.093%
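
For a sense of scale, the density figure can be read as a fraction of token positions. The sketch below assumes that "activation density" means the share of sampled token positions where the feature's activation is above zero; the page does not define it, so that reading, the function name `activation_density`, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = active positions / total positions.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 93 of them active.
sample = [0.0] * 99_907 + [1.5] * 93
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.093%
```

Under that assumption, a density of 0.093% corresponds to roughly 93 active positions per 100,000 tokens, which marks this as a sparse feature.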