    NEGATIVE LOGITS
     થય    0.42
     seduced    0.40
    မည်    0.40
    োপাধ    0.39
    (token not rendered)    0.39
    <unused339>    0.38
    czak    0.37
    <unused555>    0.36
     మరింత    0.36
     aswell    0.35
    POSITIVE LOGITS
     pertama    0.97
     पहला    0.84
     первый    0.81
    (token not rendered)    0.77
     первое    0.75
     primeira    0.75
     Pertama    0.75
     первом    0.74
     eerste    0.73
    (token not rendered)    0.73
    Activation Density 0.178%

    No Known Activations