INDEX
    Explanations

    personal pronouns followed by actions or qualities

    expressions of collective human experience or actions

    Negative Logits
    Token                   Logit
    " externalToEVAOnly"    -0.66
    " Publication"          -0.61
    "URI"                   -0.61
    " srfAttach"            -0.61
    "REDACTED"              -0.60
    " Saud"                 -0.60
    "RECT"                  -0.58
    " Nex"                  -0.57
    "fect"                  -0.57
    " Amar"                 -0.56
    Positive Logits
    Token                   Logit
    "'ve"                   0.95
    "'re"                   0.95
    "akening"               0.92
    "arers"                 0.87
    "eping"                 0.86
    "avers"                 0.86
    "alth"                  0.84
    "aning"                 0.82
    "asel"                  0.81
    "'d"                    0.81
    Act Density 0.221%

    No Known Activations
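
The activation density figure above can be read as the share of token positions on which the feature fires. The page does not say how it is computed, so the sketch below rests on an assumption: density is the number of positions with activation above a threshold (zero here) divided by the total number of positions. The function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and for illustration only.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition; the dashboard itself does not state the formula.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for 8 token positions; the feature fires on one.
sample = [0.0, 0.0, 1.7, 0.0, 0.0, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.221% would mean the feature is active on roughly 2 of every 1,000 token positions.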