INDEX
    Explanations

    correct, right

    Negative Logits
    Token               Logit
    "anj"               -0.08
    "anju"              -0.07
    ".cancel"           -0.07
    ".M"                -0.07
    "úd"                -0.07
    "anzeigen"          -0.07
    " encanta"          -0.07
                        -0.07
    " presidential"     -0.07
    "anja"              -0.07
    Positive Logits
    Token               Logit
    "-ish"              0.10
    " eventual"         0.08
    "dark"              0.08
    " dark"             0.08
    " eventualmente"    0.08
    " sorta"            0.08
    " Applic"           0.08
    " parcialmente"     0.08
    " deer"             0.08
    " Dark"             0.07
    Act Density 0.025%

    No Known Activations