INDEX
    Explanations
    Negative Logits
     praise        -0.06
     вперед        -0.06
    alignment      -0.06
     있었다        -0.06
                   -0.06
     которое       -0.06
    -fold          -0.06
    Resolution     -0.06
    ostat          -0.06
    ghest          -0.06
    Positive Logits
    =$((           0.08
    /user          0.07
     strSql        0.06
    (substr        0.06
    /Input         0.06
    '%(            0.06
    (props         0.06
    /mail          0.06
    _<             0.06
     Local         0.06
    Activation Density: 0.005%

    No Known Activations