INDEX
    Explanations
    Negative Logits
     masaje    1.25
     на        1.21
     التي      1.21
     في        1.21
     وت        1.20
     بم        1.19
     يح        1.17
     من        1.17
     ر         1.17
    🏩         1.16
    Positive Logits
    1           0.83
    Se          0.72
    Third       0.72
    Sign        0.72
    Here        0.71
    0           0.70
    Small       0.70
    Beyond      0.69
     새로운      0.69
    Standard    0.69
    Act Density 0.000%

    No Known Activations