INDEX
    Explanations

    expressions of strong affirmation or emphasis

    Negative Logits
     serem          -0.89
                    -0.86
    themselves      -0.85
     themselves     -0.81
     terem          -0.79
     yourselves     -0.63
    OGND            -0.60
    herself         -0.58
     numberOfRows   -0.55
    rxjs            -0.55
    Positive Logits
     have           1.15
     can            1.11
     don            1.10
     cannot         1.07
     am             1.06
     want           1.02
     think          0.99
     believe        0.96
     feel           0.96
     wish           0.92
    Activation Density: 0.062%

    No Known Activations