INDEX
    Explanations

    phrases related to negative events or controversial topics

    instances of the word "the."

    Negative Logits
    leeve         -0.72
    IFA           -0.67
     ItemLevel    -0.67
    Slot          -0.67
    iffe          -0.65
    click         -0.65
    cture         -0.65
    oken          -0.63
    MK            -0.62
    *             -0.62
    Positive Logits
     ensuing        1.35
     resultant      1.24
     resulting      1.19
     accompanying   1.17
     remainder      1.14
     consequ        1.12
     slightest      1.09
     vast           1.06
     latter         1.06
     entire         1.04
    Act Density 0.307%

    No Known Activations