INDEX
    Explanations

    words related to mental health conditions and experiences such as dissociation, identity confusion, and memory alteration

    Negative Logits
    Token      Logit
    inery      -0.78
    ©¶æ¥µ      -0.74
    abba       -0.71
     pione     -0.68
    abor       -0.67
    iHUD       -0.65
    20439      -0.64
    alde       -0.64
    ullivan    -0.63
    Ãį         -0.63

    Positive Logits
    Token      Logit
    ?          1.99
    ?:         1.89
    ?'         1.86
    ?"         1.86
    ?)         1.79
    ?),        1.76
    ?",        1.75
    ?).        1.74
    ?!         1.72
    ?".        1.69
    Activation Density: 1.748%

    No Known Activations