INDEX
    Explanations

    references to psychological disorders and their associated characteristics

    Negative Logits
    Token     Logit
    " un"     -0.77
    " pu"     -0.75
    ","       -0.74
    "."       -0.72
    " no"     -0.70
    " to"     -0.68
    " ha"     -0.68
    " in"     -0.68
    " bu"     -0.67
    " so"     -0.65
    Positive Logits
    Token          Logit
    " Monfieur"    1.64
    " Jefus"       1.59
    " myſelf"      1.57
    " Efq"         1.51
    " Houſe"       1.49
    " itſelf"      1.49
    " Majefty"     1.48
    "ſelf"         1.44
    " himſelf"     1.43
    " Eſ"          1.41
    Activation Density: 0.828%

    No Known Activations