INDEX
    Explanations

    phrases that refer to superlatives and rankings

    the repeated use of the word "the."

    Negative Logits
    Token         Logit
    "illion"      -0.74
    " partake"    -0.72
    " assume"     -0.70
    "ambo"        -0.70
    "dale"        -0.68
    "OTA"         -0.67
    "FILE"        -0.66
    "rand"        -0.65
    "riot"        -0.64
    "render"      -0.64
    Positive Logits
    Token          Logit
    " easiest"     1.14
    " strongest"   0.99
    " safest"      0.94
    " hardest"     0.94
    " toughest"    0.92
    " opposite"    0.91
    " simplest"    0.90
    " same"        0.90
    " greatest"    0.86
    " largest"     0.85
    Activation Density 0.072%

    No Known Activations