INDEX
    Explanations

    phrases that indicate a call to action or directives

    Negative Logits
    "undles"    -0.16
    "ypes"      -0.15
    "utor"      -0.15
    "rej"       -0.14
    "dash"      -0.14
    "abela"     -0.14
    "eka"       -0.14
    "okit"      -0.14
    "ambi"      -0.14
    "ily"       -0.14
    Positive Logits
    " attention"    0.26
    " quits"        0.23
    " duty"         0.23
    "ibrate"        0.23
    " forth"        0.22
    "oused"         0.22
    " dib"          0.21
    " Attention"    0.21
    " action"       0.20
    " Duty"         0.20
    Activation Density: 0.048%

    No Known Activations