INDEX
    Explanations

    negative prefixes or terms related to being unwell or unhappy

    Negative Logits
    "jee"           -0.15
    "nice"          -0.15
    "emer"          -0.15
    "ollections"    -0.15
    "erna"          -0.14
    "dbname"        -0.14
    "unity"         -0.14
    "しても"        -0.14
    "activity"      -0.14
    "ernes"         -0.14
    Positive Logits
    "ashed"         0.19
    "/un"           0.18
    "atable"        0.18
    "oppable"       0.17
    " hưởng"        0.17
    "uguay"         0.16
    "MBER"          0.16
    "strained"      0.16
    "atron"         0.16
    " Hanson"       0.15
    Activation Density: 0.016%

    No Known Activations