INDEX
    Explanations

    words related to being surprised or not expecting a particular outcome

    Negative Logits
    "kay"           -0.20
    "ngth"          -0.20
    "apt"           -0.19
    "apeshifter"    -0.19
    "obal"          -0.19
    "hesion"        -0.19
    "amins"         -0.19
    "urai"          -0.18
    "itiz"          -0.18
    "©¶æ"           -0.18
    Positive Logits
    "LER"           0.22
    "stakes"        0.20
    "swick"         0.19
    "Sax"           0.19
    " Rate"         0.19
    "ATIONS"        0.18
    " Dra"          0.18
    "inged"         0.18
    "LB"            0.18
    "052"           0.18
    Activation Density: 12.268%

    No Known Activations