    Explanations

    The token "a" followed by descriptive words

    NEGATIVE LOGITS
    Token              Value
    "лары"             1.06
    " Wh"              1.05
    "-"                1.04
    "领域的"            1.03
    "ajer"             1.02
    " departmental"    0.99
    "ایت"              0.98
    "orld"             0.97
    "acch"             0.97
    "azz"              0.97
    POSITIVE LOGITS
    Token              Value
    " guter"           1.80
    " moins"           1.74
    " tenho"           1.71
    " posso"           1.69
    " joli"            1.64
    " nessuna"         1.62
    " inget"           1.59
    " estou"           1.59
    " ottimo"          1.58
    " nincs"           1.58
    Activation Density: 0.539%

    No Known Activations