INDEX
Explanations
personal pronouns indicating possession
Negative Logits
Token       Logit
wrap        -0.83
yang        -0.79
pher        -0.75
avis        -0.73
rov         -0.73
bender      -0.72
etz         -0.71
inctions    -0.70
olkien      -0.70
daq         -0.70
Positive Logits
Token           Logit
own             1.70
favorite        1.43
favourite       1.36
selves          1.18
beloved         1.15
ancestors       1.15
surroundings    1.13
friends         1.11
hometown        1.11
thoughts        1.10
Activations Density: 1.169%