INDEX
    NEGATIVE LOGITS
    ،
    4.26
    ,
    3.89
    3.37
    、「
    3.11
    ®,
    2.77
    ,—
    2.77
    、“
    2.70
    -,
    2.66
    ّ
    2.62
    :
    2.58
    POSITIVE LOGITS
     yani
    2.64
     yada
    2.56
     ie
    2.39
     보면은
    2.38
     which
    2.38
     które
    2.35
     wobei
    2.35
     namely
    2.34
     atleast
    2.33
     albeit
    2.33
    Activation Density: 2.589%

    No Known Activations