    Explanations

    arguably the most well-known

    Negative Logits
    "ではなく"  0.39
    "ってください"  0.38
    "のではなく"  0.38
    "iceps"  0.33
    " တယ်"  0.32
    "ということ"  0.32
    "の値"  0.32
    " હજાર"  0.31
    "ということです"  0.31
    "adım"  0.31
    Positive Logits
    " arguably"  2.36
    " probably"  2.16
    "Probably"  2.11
    " Probably"  2.06
    " perhaps"  2.05
    "probably"  2.00
    "Perhaps"  1.83
    " talvez"  1.83
    " quizás"  1.80
    "perhaps"  1.77
    Act Density 0.460%

    No Known Activations