    Negative Logits
    Token         Value
    " Didn"       0.80
    " hasn"       0.79
    "Didn"        0.79
    " wouldn"     0.78
    " করেনি"      0.77
    " isn"        0.77
    " করেননি"     0.75
    " wasn"       0.68
    " করছেন"      0.68
    "Wouldn"      0.68
    Positive Logits
    Token         Value
    " do"         0.82
    "до"          0.73
    " ڈ"          0.70
    "do"          0.65
    (blank)       0.65
    " до"         0.63
    "דו"          0.59
    " Do"         0.57
    " د"          0.56
    (blank)       0.56
    Activation Density: 0.120%

    No Known Activations