INDEX
    Explanations

    proper nouns, particularly names of places, organizations, and people

    instances of the word "The."

    Negative Logits
    Token          Logit
    ".ãĢį"         -0.71
    ".}"           -0.67
    "ment"         -0.67
    "itiz"         -0.61
    "enance"       -0.60
    " anyway"      -0.60
    "SPONSORED"    -0.59
    " upside"      -0.57
    "onite"        -0.57
    " anyways"     -0.57
    Positive Logits
    Token          Logit
    " The"         2.31
    " This"        1.43
    "The"          1.39
    " THE"         1.36
    " There"       1.35
    " When"        1.34
    " These"       1.30
    " It"          1.29
    " Those"       1.29
    " Another"     1.29
    Activation Density: 0.257%

    No Known Activations