INDEX
    Explanations

    words related to emotion or feelings

    Negative Logits
     dro          -0.15
    amat          -0.15
    ëŀĢ           -0.14
    az            -0.14
     Dob          -0.14
    stor          -0.14
     chim         -0.14
    Fab           -0.14
     ure          -0.13
    .n            -0.13

    Positive Logits
    ppard         0.18
    PÅĻed         0.17
    esktop        0.16
    isclosed      0.16
    éĺħ读次æķ°    0.15
    uitka         0.14
    ummy          0.14
    reesome       0.14
    evice         0.14
    ãĤ            0.14

    Act Density 0.127%

    No Known Activations