    Explanations

    code/math/technical

    Negative Logits
    Token         Logit
    markdown      -0.06
    вся           -0.06
    足球          -0.06
    ression       -0.06
     Cast         -0.06
     spokes       -0.06
                  -0.06
     Ford         -0.06
    nowrap        -0.06
    시에          -0.06
    Positive Logits
    Token         Logit
     Liber        0.07
    ichick        0.07
    '";↵          0.07
    081           0.07
     С            0.07
     Formatting   0.07
    Lbl           0.06
                  0.06
    People        0.06
     livelihood   0.06
    Act Density 0.003%

    No Known Activations