Explanations
requests or instructions in a formal tone
Negative Logits
Token       Logit
ãĤ¦ãĤ¹      -0.80
pires       -0.79
ELD         -0.74
Orig        -0.69
MpServer    -0.68
cler        -0.68
ļé          -0.66
lot         -0.65
Pear        -0.63
arthed      -0.60
Positive Logits
Token       Logit
beware      1.61
consider    1.18
refrain     1.17
ignore      1.10
remember    1.09
heed        1.08
Ignore      1.08
ignore      1.08
note        1.06
caution     1.05
Activations Density 1.883%