INDEX
    Explanations

    phrases indicating a sequence of events, specifically events that happened just before a certain action or outcome

    pronouns, particularly focusing on the repeated mentions of "he," "she," "I," and "they."

    Negative Logits
    " Associated"        -0.71
    " Monteneg"          -0.64
    " Canaver"           -0.64
    " Consortium"        -0.64
    " Electrical"        -0.63
    " Federation"        -0.63
    " Optim"             -0.61
    " understatement"    -0.61
    " sarc"              -0.61
    " Strategy"          -0.60
    Positive Logits
    "'ve"                 1.04
    " arrived"            0.91
    " started"            0.87
    " began"              0.87
    "'re"                 0.87
    "hran"                0.84
    " became"             0.82
    "'d"                  0.82
    " exited"             0.82
    " arrive"             0.82
    Activation Density: 0.136%

    No Known Activations