INDEX
    Explanations
    No Explanations Found
    NEGATIVE LOGITS
    ك  0.72
    其他  0.61
    ين  0.60
    يا  0.59
    [token not shown]  0.57
    ла  0.55
    كار  0.55
    ات  0.54
    اري  0.54
    ת  0.53
    POSITIVE LOGITS
    ۰  0.59
    0  0.51
    ]  0.46
    )  0.41
    ;  0.41
    [token not shown]  0.41
     for  0.38
    ਿਆਂ  0.38
    >  0.38
     대상으로  0.38
    Activation Density: 6.832%

    No Known Activations