INDEX
    Explanations

    references to negative events or actions

    the repeated use of the word "the" in context

    Negative Logits
    Token               Logit
    " again"            -0.78
    "aja"               -0.73
    "achus"             -0.71
    " instead"          -0.67
    "worn"              -0.66
    " whilst"           -0.66
    "ply"               -0.65
    "ache"              -0.65
    "tle"               -0.65
    " whenever"         -0.64
    Positive Logits
    Token               Logit
    " aforementioned"   1.22
    " latter"           1.20
    "ses"               1.06
    " same"             1.05
    " slightest"        1.00
    " greatest"         1.00
    " latest"           0.99
    " Clintons"         0.95
    " entirety"         0.90
    " respective"       0.89
    Act Density 0.666%

    No Known Activations