INDEX
    Explanations

    negative expressions and descriptions of discomfort or adversity

    Negative Logits
    ignon       -0.16
    utton       -0.16
    UTTON       -0.16
     Dent       -0.15
    fav         -0.14
    Dialogue    -0.14
     Ire        -0.13
    ìłľ         -0.13
    Dialog      -0.13
    ets         -0.13
    Positive Logits
    hack        0.17
     jean       0.16
     hack       0.16
    kola        0.15
    ivid        0.15
     hacks      0.15
    ECTOR       0.14
     Struct     0.14
    aces        0.14
    EO          0.14
    Activation Density: 1.474%

    No Known Activations