INDEX
    Explanations
    Negative Logits
    Token                Logit
    " صوتيه"             -0.94
    " Lion"              -0.91
    "InjectAttribute"    -0.87
    "Lion"               -0.81
    " للمعارف"           -0.79
    "expandindo"         -0.73
    "OrNil"              -0.73
    "ंदीखरीदारी"          -0.71
    "Diwedd"             -0.71
    " cookies"           -0.70

    Positive Logits
    Token                Logit
    "ing"                0.79
    "s"                  0.60
    "spo"                0.56
    "ی"                  0.53
    "sing"               0.51
    "scan"               0.50
    "g"                  0.49
    "scape"              0.49
    "o"                  0.49
    "ING"                0.49
    Act Density 0.477%

    No Known Activations