INDEX
    Explanations

    observation, assessment

    Negative Logits
    Token               Logit
    " мала"             -0.06
    " verge"            -0.06
    " зали"             -0.06
                        -0.06
    " shout"            -0.06
    ".getUser"          -0.06
    " Бол"              -0.06
    "ndef"              -0.06
    " Authorization"    -0.06
    " lipstick"         -0.06
    Positive Logits
    Token               Logit
    "+_"                0.07
    " MLM"              0.06
    " Một"              0.06
    "adies"             0.06
                        0.06
                        0.06
    "Disallow"          0.06
    "matic"             0.06
    " mdb"              0.06
    " grades"           0.06
    Activation Density: 0.019%

    No Known Activations