INDEX
    Explanations

    descriptive phrases followed by classifiers

    Negative Logits
    ना               0.46
    ä                0.43
    ف                0.43
     realist         0.42
    ع                0.42
    وا               0.42
    扱う             0.41
    ه                0.40
                     0.40
    ز                0.39
    Positive Logits
                     0.52
    allowSlide       0.48
     balanceOf       0.47
    issan            0.46
     కార్యక్రమంలో    0.45
     HTMLSc          0.45
     reinforce       0.44
     BrowserWindow   0.43
     délé            0.43
    undered          0.43
    Activation Density: 0.003%

    No Known Activations