    Negative Logits
    Token        Logit
    " iter"      -0.07
    "257"        -0.07
    " fran"      -0.06
    " delim"     -0.06
    " ded"       -0.06
    "flatten"    -0.06
    " sharper"   -0.06
    "-down"      -0.06
    "obre"       -0.06
    " распред"   -0.06
    Positive Logits
    Token        Logit
    " Code"      0.06
    " Boeing"    0.06
    "STEM"       0.06
    " 修改"      0.06
    " Engineer"  0.06
    " chơi"      0.06
    " News"      0.06
    " MX"        0.06
    " Drone"     0.06
    " TY"        0.06
    Activation Density: 0.001%

    No Known Activations