INDEX
    NEGATIVE LOGITS
    Token           Logit
    "_tpl"          -0.08
    " povin"        -0.07
    "rn"            -0.07
    " Saturdays"    -0.07
    " FT"           -0.07
    " nerve"        -0.06
    ".tiles"        -0.06
    " squirrel"     -0.06
    "answers"       -0.06
    "Dick"          -0.06
    POSITIVE LOGITS
    Token           Logit
    " заходів"      0.06
    "/admin"        0.06
    " ABD"          0.06
    "(dir"          0.06
    " Bin"          0.06
    "(strict"       0.06
    (not shown)     0.06
    " VIP"          0.06
    ".Butter"       0.06
    "myp"           0.06
    Activation Density: 0.031%

    No Known Activations