INDEX
    Explanations
    Negative Logits
     येतात    0.64
     می‌شوند    0.56
     होतात    0.55
     करतात    0.53
     जातात    0.52
     सांगतात    0.52
    ходят    0.51
     येतो    0.50
     येते    0.50
    します    0.50
    Positive Logits
     became    0.83
     came    0.82
     gave    0.82
     took    0.80
     did    0.79
     was    0.78
     went    0.77
     began    0.76
     grew    0.73
     fell    0.72
    Act Density 0.128%

    No Known Activations