INDEX
    Explanations

    phrases indicating authorship or previous mention in a written piece

    first-person personal pronouns and statements

    Negative Logits
    Token           Logit
    "////"          -0.73
    "Choice"        -0.66
    "UCT"           -0.65
    "OUP"           -0.63
    "jection"       -0.62
    "Favorite"      -0.62
    "orses"         -0.62
    " Requ"         -0.62
    "Journal"       -0.62
    "letters"       -0.61

    Positive Logits
    Token           Logit
    " mentioned"     0.94
    " alluded"       0.88
    " progressed"    0.87
    " stated"        0.85
    " explained"     0.85
    " noted"         0.84
    " pointed"       0.84
    " progresses"    0.84
    " discussed"     0.79
    " recounted"     0.78
    Activation Density: 0.056%

    No Known Activations