INDEX
    Explanations

    mentions of pronouns related to individuals and their actions or states

    Negative Logits
    龍契士        -0.64
     Vanity      -0.64
     Innocent    -0.62
     Observer    -0.61
     Yesterday   -0.60
     NK          -0.60
     Kirin       -0.59
     NP          -0.59
     Oriental    -0.59
     Nin         -0.58
    Positive Logits
    'll          1.11
    'd           1.08
     proceeded   1.06
     reverted    1.02
     withdrew    1.00
     recons      1.00
     resumed     0.99
     realized    0.98
     retreated   0.97
     encount     0.96
    Act Density 0.144%

    No Known Activations