INDEX
    Explanations

    various forms of the word "order" and references to compliance or disobedience related to commands or instructions

    Negative Logits
    sidemargin    -0.73
    󠁧    -0.68
    发表于    -0.67
    tvguidetime    -0.67
    addContainerGap    -0.64
    multer    -0.64
     ostavi    -0.63
    存于互联网档案馆    -0.63
     enriquec    -0.62
    sharing    -0.61
    Positive Logits
     instructions    1.15
     orders    1.02
     commands    1.00
     Instructions    1.00
     directives    0.98
     INSTRUCTIONS    0.96
    指令    0.95
     Auftrag    0.94
     directive    0.91
     commanded    0.90
    Activation Density 0.255%

    No Known Activations