INDEX
    Explanations

    phrases indicating disregard or concession

    Negative Logits
    inge        -0.16
    kelig       -0.15
    bersome     -0.14
    ýš          -0.14
    arnation    -0.14
    иной        -0.14
    åħ»         -0.14
    erguson     -0.14
    umbing      -0.14
    uegos       -0.13

    Positive Logits
    ots         0.16
     Butter     0.15
    297         0.15
    ÙħÙĪÙĦ      0.14
    atte        0.14
    uell        0.13
     Rim        0.13
     latter     0.13
    /e          0.13
    urst        0.13

    Act Density 0.024%

    No Known Activations