INDEX
    Explanations

    comparative and superlative terms indicating improvement or preference

    Negative Logits
    Token      Logit
    "ATO"      -0.17
    "ato"      -0.16
    "duk"      -0.16
    "ittle"    -0.15
    "engo"     -0.15
    "avan"     -0.14
    "zend"     -0.14
    "ieri"     -0.14
    "mere"     -0.13
    "zeich"    -0.13
    Positive Logits
    Token          Logit
    "-su"          0.21
    "idge"         0.20
    " avoided"     0.18
    " suited"      0.18
    " served"      0.16
    " err"         0.15
    "Su"           0.15
    "suite"        0.14
    " suicide"     0.14
    " sticking"    0.14
    Activation Density: 0.058%

    No Known Activations