Explanations
phrases related to creating or owning something
phrases indicating possession or ownership related to personal creations
Negative Logits
forth       -0.73
ĸļ          -0.69
teness      -0.65
ocument     -0.65
noon        -0.60
train       -0.60
âĢº         -0.59
thood       -0.58
hari        -0.58
Prev        -0.55
Positive Logits
own         2.49
OWN         1.78
Own         1.77
own         1.18
selves      1.10
self        1.00
Own         0.89
desired     0.86
first       0.86
respective  0.85
Activations Density: 0.157%