INDEX
    Explanations
    Negative Logits
    ností                               -0.07
     имеют                              -0.06
    ede                                 -0.06
    .main                               -0.06
     Worldwide                          -0.06
     resp                               -0.06
     --------------------------------   -0.06
    تبر                                 -0.06
    281                                 -0.06
     kann                               -0.06
    Positive Logits
     abusing                            0.08
    _WRONG                              0.07
     OnTrigger                          0.07
    moving                              0.07
     Aging                              0.07
     important                          0.06
    library                             0.06
    uding                               0.06
    endimento                           0.06
    ­ing                                0.06
    Activation Density: 0.004%

    No Known Activations