    Explanations

    words related to exaggeration or simplification

    instances of the word "overs" or its variations, indicating a focus on the concept of oversimplification

    Negative Logits
    Token          Logit
    " arc"         -0.71
    " Nanto"       -0.68
    " motions"     -0.63
    " corridor"    -0.62
    " Sho"         -0.62
    " Dot"         -0.62
    "Guard"        -0.62
    " utmost"      -0.61
    " theater"     -0.61
    " tooth"       -0.60
    Positive Logits
    Token          Logit
    " overs"       1.29
    "impl"         1.14
    "aturated"     0.98
    "lap"          0.95
    "leep"         0.90
    "amples"       0.86
    "icro"         0.84
    "laughter"     0.80
    "ummer"        0.78
    "olicited"     0.76
    Activation Density: 0.005%

    No Known Activations