INDEX
    Negative Logits
                      -0.07
     باشید            -0.07
    (contact          -0.07
     нічого           -0.06
    .runner           -0.06
    MK                -0.06
    yang              -0.06
    /apps             -0.06
     bahwa            -0.06
    (gs               -0.06
    Positive Logits
     Wisconsin        0.07
    _equ              0.07
     Outcome          0.06
    lerin             0.06
     effective        0.06
     discriminatory   0.06
     Brewing          0.06
     Combine          0.06
    lahoma            0.06
    wizard            0.06
    Activation Density: 0.019%

    No Known Activations