    Explanations

    please followed by action

    Negative Logits
     труд  0.41
     claras  0.38
    ">  0.38
    しやすい  0.37
    委会  0.36
    /  0.35
     کھیلنا  0.35
     saddhim  0.35
     infidelity  0.35
    क्र  0.35
    Positive Logits
     note  0.61
     beware  0.57
     beachten  0.54
     PLEASE  0.54
     please  0.51
     forgive  0.51
     don  0.50
     excuse  0.50
    0.50
    цкі  0.49
    Activation Density: 0.019%

    No Known Activations