INDEX
    Explanations

    phrases that convey intense emotions or reactions, particularly relating to conflict or confrontation

    Negative Logits
     Nel
    -0.14
     Ferd
    -0.14
    ame
    -0.14
    นอ
    -0.14
    beck
    -0.14
    stab
    -0.14
    unca
    -0.13
     McGr
    -0.13
     scratch
    -0.13
    iel
    -0.13
    Positive Logits
    ancia
    0.17
     proverb
    0.17
    sock
    0.17
     socks
    0.17
    afone
    0.15
    éric
    0.15
     hell
    0.15
     heck
    0.15
     guts
    0.15
     ass
    0.15
    Act Density 0.140%

    No Known Activations