    NEGATIVE LOGITS
    "ל"  0.78
    " to"  0.71
    "at"  0.70
    " ("  0.67
    "TestSource"  0.64
    "CipherText"  0.63
    " co"  0.63
    "yai"  0.62
    " جای"  0.62
    "umum"  0.62
    POSITIVE LOGITS
    " действий"  1.24
    " действия"  1.04
    "t"  0.94
    "动作"  0.90
    " действие"  0.84
    "アクション"  0.84
    " actions"  0.83
    " action"  0.78
    "шему"  0.78
    "setAction"  0.77
    Activation Density: 0.403%

    No Known Activations