INDEX
    Explanations

    mentions of mental health and conditions

    Head Attr Weights
    0:0.03
    1:0.00
    2:0.11
    3:0.41
    4:0.05
    5:0.02
    6:0.03
    7:0.04
    8:0.08
    9:0.07
    10:0.07
    11:0.05
    Negative Logits
    —"          -2.00
    ),"         -1.79
    …"          -1.64
     Advent     -1.56
    ë           -1.53
     Martial    -1.52
    !),         -1.51
     darling    -1.49
    ..."        -1.49
    !"          -1.49
    Positive Logits
     chars      1.76
    SW          1.75
    rimp        1.59
     **         1.54
    lishes      1.52
    kit         1.51
     captcha    1.50
    imeo        1.50
    verified    1.48
     STEP       1.47
    Act Density 0.007%

    No Known Activations