    NEGATIVE LOGITS
    et          1.42
    an          1.41
    el          1.33
     множе      1.29
    aner        1.28
    it          1.23
    ar          1.19
    elom        1.18
    ن           1.18
    νες         1.17
    POSITIVE LOGITS
    晚上          1.46
                1.28
    break       1.25
     вечером    1.22
     pihak      1.20
     giorno     1.19
     واللي      1.18
    ुक्त        1.18
     рождения   1.17
    $).         1.17
    Activation Density: 0.014%

    No Known Activations