INDEX
    Explanations

    words associated with emotional expressions

    Negative Logits
    ney      -0.17
    nya      -0.16
    isset    -0.16
    iti      -0.16
    maf      -0.16
    neys     -0.16
    iet      -0.15
    sel      -0.15
    lor      -0.14
    lu       -0.14
    Positive Logits
    ek       0.29
    eting    0.25
    ering    0.25
    eming    0.24
    eking    0.24
    ez       0.24
    evil     0.23
    eper     0.23
    enie     0.23
    eks      0.23
    Activation Density: 0.087%

    No Known Activations