INDEX
    Explanations

    references to mental health issues, particularly depression and anxiety

    Negative Logits
    anning          -0.17
    åĪ©             -0.16
    osy             -0.15
    адж             -0.14
    ãĥ³ãĤ¹          -0.14
    tparam          -0.14
    ãĤ¼             -0.14
    ãĥ¬ãĥĥãĥĪ       -0.14
    κι              -0.14
    asant           -0.14
    Positive Logits
     mood           0.18
     Sad            0.16
     Depression     0.15
     depressive     0.15
    ductive         0.15
     Mood           0.15
    /an             0.14
     Factory        0.14
     antidepress    0.14
    é¡Ķ             0.14
    Activation Density: 0.074%

    No Known Activations