    Explanations

    the word "too" and its variations indicating excessiveness

    Negative Logits
    Token          Logit
    "rael"         -0.15
    "ry"           -0.15
    "licht"        -0.15
    " assez"       -0.14
    "compat"       -0.14
    "ávÄĽ"         -0.14
    "phan"         -0.14
    " happier"     -0.14
    "rof"          -0.14
    " äl"          -0.14
    Positive Logits
    Token          Logit
    " much"        0.34
    " soon"        0.27
    "led"          0.27
    "much"         0.25
    " many"        0.25
    " Much"        0.25
    "Much"         0.25
    " late"        0.23
    "oooo"         0.23
    "oooooooo"     0.23
    Activation Density: 0.025%

    No Known Activations