    Negative Logits
    Token           Logit
    " Desktop"      -0.07
    " Col"          -0.07
    " труд"         -0.07
    " [];↵↵"        -0.06
    " counts"       -0.06
    "[];↵↵"         -0.06
    " OCD"          -0.06
    " krát"         -0.06
    " congen"       -0.06
    "typ"           -0.06
    Positive Logits
    Token           Logit
    "ìm"            0.07
    "الع"           0.07
    ".netty"        0.06
    " rain"         0.06
                    0.06
    "UM"            0.06
    " Smash"        0.06
    "DOI"           0.06
    " organizing"   0.06
    " freshman"     0.06
    Activation Density: 0.001%

    No Known Activations