    Negative Logits
    Token             Logit
    (not shown)       0.43
    " बताना"          0.43
    "roga"            0.43
    (not shown)       0.42
    " asegurarse"     0.41
    "லைப்"            0.40
    "是将"            0.40
    "ຮັບ"             0.40
    " помога"         0.39
    " estaven"        0.39
    Positive Logits
    Token             Logit
    "导致"            1.95
    "導致"            1.86
    " منجر"           1.80
    " resulting"      1.67
    " resulted"       1.67
    " causing"        1.60
    " menyebabkan"    1.52
    " leads"          1.45
    "resulting"       1.42
    " باعث"           1.38
    Act Density 0.045%

    No Known Activations