INDEX
    Explanations
    Negative Logits
    Token           Logit
    " taken"        -0.08
    " restrict"     -0.07
    "taken"         -0.07
    " offers"       -0.07
    " policy"       -0.07
    " depends"      -0.07
    "_pattern"      -0.06
    " disclosed"    -0.06
    "_module"       -0.06
    " discuss"      -0.06
    Positive Logits
    Token                  Logit
    "ूबर"                  0.06
    " زن"                  0.06
    " Kết"                 0.06
    ";."                   0.06
    " Más"                 0.06
    ".fb"                  0.06
    "enced"                0.06
    " Ödül"                0.06
    "UIStoryboardSegue"    0.06
    "rgyz"                 0.06
    Act Density: 0.004%

    No Known Activations