INDEX
    Explanations

    pronouns followed by a verb

    instances of pronouns and their references in context

    Negative Logits
    "igate"      -0.69
    "screen"     -0.68
    "priv"       -0.67
    "tnc"        -0.66
    "hart"       -0.66
    "orse"       -0.66
    "TT"         -0.66
    "代"         -0.65
    " Forty"     -0.65
    "Opening"    -0.65
    Positive Logits
    " nonetheless"     1.45
    " nevertheless"    1.42
    " persisted"       0.99
    " still"           0.95
    " alas"            0.91
    " didnt"           0.90
    "'ll"              0.89
    " fortunately"     0.88
    " certainly"       0.88
    " agre"            0.87
    Activation Density: 0.306%

    No Known Activations