INDEX
    Explanations
    Negative Logits
     VN            -0.06
    .tools         -0.06
     μπορού        -0.06
     празд         -0.06
     blatant       -0.06
     oxidative     -0.06
     мінім         -0.06
    Reverse        -0.06
    .Round         -0.06
    しか           -0.06
    Positive Logits
    _blog          0.08
    _dash          0.07
     лечения       0.07
    ạt             0.07
     surgeons      0.07
                   0.06
    arked          0.06
                   0.06
     Delaware      0.06
    %"             0.06
    Act Density 0.030%

    No Known Activations