INDEX
Explanations
words related to ownership or possession
occurrences of the word "our"
Negative Logits
conom     -0.84
icter     -0.79
puff      -0.79
endra     -0.75
fect      -0.71
hift      -0.70
cum       -0.69
bender    -0.69
ppings    -0.68
wrap      -0.68
Positive Logits
selves       1.24
own          1.05
beloved      0.98
ourselves    0.95
dear         0.88
asses        0.87
ancestors    0.86
esteemed     0.85
hearts       0.85
selves       0.84
Activations Density: 0.144%