INDEX
    Explanations

    people's names or pronouns referring to specific individuals

    references to individuals or entities defined by the pronoun "whom."

    Negative Logits
    Token           Logit
    "trap"          -0.70
    "termin"        -0.69
    "artifacts"     -0.65
    "repre"         -0.64
    "starting"      -0.64
    "0100"          -0.60
    " inhibitor"    -0.59
    "hig"           -0.59
    " pend"         -0.59
    "pillar"        -0.59
    Positive Logits
    Token           Logit
    "soever"        2.13
    " she"          0.91
    " he"           0.90
    " we"           0.89
    " they"         0.85
    " thou"         0.80
    " critics"      0.80
    " Vanity"       0.77
    " you"          0.74
    " Chomsky"      0.72
    Act Density 0.026%

    No Known Activations