INDEX
Explanations
phrases related to taking action or making decisions
phrases that indicate actions or instructions related to personal engagement or communication
Negative Logits
Contin   -0.74
AX       -0.71
cium     -0.68
ãĥĹ      -0.65
iband    -0.62
oids     -0.61
enium    -0.61
iber     -0.60
GO       -0.60
ibu      -0.60
Positive Logits
theirs   1.85
hers     1.64
yours    1.59
ours     1.55
mine     1.40
your     1.26
my       1.22
his      1.19
your     1.18
YOUR     1.16
Activations Density 0.405%