INDEX
    Explanations

    instances of strong emotional expressions or reactions

    Negative Logits
    Token           Logit
    "assage"        -0.16
    "loe"           -0.15
    "ÄĻk"           -0.15
    "intColor"      -0.14
    "blr"           -0.14
    "atement"       -0.14
    "isses"         -0.14
    "ãĥ¥ãĥ¼"        -0.14
    "fragistics"    -0.14
    "znám"          -0.13
    Positive Logits
    Token           Logit
    " there"        0.15
    " Bilg"         0.14
    "aden"          0.14
    " hon"          0.14
    " maybe"        0.14
    " mere"         0.12
    " Carolyn"      0.12
    "yp"            0.12
    " no"           0.12
    " Bloss"        0.12
    Activation Density: 0.124%

    No Known Activations