INDEX
    Explanations

    words related to psychological or self-related concepts

    terms related to self-related conditions and actions

    Negative Logits
    Token (quoted)      Logit
    "sea"               -0.82
    "nan"               -0.71
    "ugu"               -0.71
    "anwhile"           -0.68
    "KEY"               -0.68
    "hillary"           -0.66
    "estone"            -0.66
    "fml"               -0.66
    "endez"             -0.65
    "eday"              -0.64
    Positive Logits
    Token (quoted)      Logit
    "itled"             0.69
    " attribution"      0.68
    "rency"             0.68
    "essed"             0.67
    " blame"            0.64
    " gratification"    0.63
    "rating"            0.61
    " prophecy"         0.61
    " exile"            0.61
    "ihilation"         0.60
    Activation Density: 0.057%

    No Known Activations