    Negative Logits
    " ("      0.85
    " was"    0.82
    " on"     0.77
    "ot"      0.73
    "k"       0.69
    "ut"      0.64
    " at"     0.63
    " ছিল"    0.63
    "kosť"    0.62
    "្នក"     0.61
    Positive Logits
    "و"       1.02
    "ിൽ"      0.85
    "ى"       0.82
    "ю"       0.80
    "ی"       0.79
    "ς"       0.78
    "ہ"       0.77
    "но"      0.77
    (blank)   0.77
    "ین"      0.75
    Act Density 0.681%

    No Known Activations