    Negative Logits
    Token               Logit
    "Transformation"    -0.06
    " align"            -0.06
    "040"               -0.06
    "ignore"            -0.06
    " attractions"      -0.06
    " xPos"             -0.06
    " miệng"            -0.06
    ".On"               -0.06
    " upright"          -0.06
    "******↵"           -0.06
    Positive Logits
    Token           Logit
    "/books"        0.06
    " педагог"      0.06
    (blank)         0.06
    "{return"       0.06
    "(records"      0.06
    ".Dispose"      0.06
    (blank)         0.06
    " wannonce"     0.06
    " bieten"       0.06
    "say"           0.06
    Activation Density: 0.006%

    No Known Activations