INDEX
    Explanations

    pronouns followed by verbs indicating action

    pronouns and references to groups of individuals

    NEGATIVE LOGITS
    ahime       -0.74
     RTX        -0.65
    ーテ        -0.63
    Correct     -0.62
    00000       -0.61
    =-=-        -0.60
    rium        -0.59
    bridge      -0.58
    Press       -0.58
    politics    -0.58
    POSITIVE LOGITS
     were          1.00
     are           0.98
     perished      0.92
     relate        0.84
     belong        0.84
     involve       0.82
     consisted     0.81
     originated    0.81
     reside        0.81
     have          0.77
    Act Density 0.072%

    No Known Activations