Explanations
pronouns referring to possession or belonging
possessive pronouns and related possessive terms
Negative Logits
Token        Logit
ominated     -0.73
ifact        -0.72
ographers    -0.66
uph          -0.65
otin         -0.64
otation      -0.64
yip          -0.64
verb         -0.64
ulas         -0.63
daq          -0.62
Positive Logits
Token        Logit
midst         1.27
own           1.20
stead         1.20
entirety      1.18
vicinity      1.14
guise         1.03
stride        1.02
infancy       1.00
footsteps     0.98
wake          0.95
Activations Density: 0.135%