    Explanations

    responsibilities and abilities

    Negative Logits
    "s"      1.46
    "age"    1.23
    "w"      1.16
    "2"      1.16
    "ada"    1.11
    "ре"     1.10
    "f"      1.10
    "z"      1.07
    "in"     1.06
    "3"      1.06
    Positive Logits
    " "                  1.36
    "ก็"                 1.18
    "گیرد"               1.18
    " been"              1.16
    "ラの"               1.13
    "ائیں۔"              1.03
    " ovo"               1.01
    " осуществляется"    1.01
    " može"              1.00
    " i"                 0.97
    Act Density 0.000%

    No Known Activations