INDEX
Explanations
commands or requests to stop or take action
commands or directives to stop or change actions
Negative Logits
NetMessage    -0.70
sqor          -0.67
touted        -0.66
cember        -0.66
etimes        -0.65
Flavoring     -0.64
eor           -0.64
hinted        -0.63
seams         -0.63
aceutical     -0.63
Positive Logits
ASAP          0.81
igate         0.77
urgently      0.77
escription    0.72
uate          0.71
lift          0.70
itate         0.69
plane         0.68
dress         0.67
urgent        0.64
Activations Density 0.285%