INDEX
    Explanations
    Negative Logits
    " travaillé"    0.68
    "תוך"    0.67
    "kében"         0.67
    " लिखकर"    0.64
    "дал"           0.63
    " পদক্ষেপ"    0.63
    "वर्ण"    0.62
    "]+/"           0.62
    "}\|_{"         0.62
    "/#{"           0.61

    Positive Logits
    " obviously"    4.77
    " Obviously"    4.46
    "obviously"     4.28
    "Obviously"     4.20
    " ovviamente"   3.78
    " obviamente"   3.70
    " évidemment"   3.67
    " obvious"      3.63
    "当然"          3.36
    "當然"          3.33

    Act Density 0.648%

    No Known Activations