INDEX
    Explanations
    Negative Logits
     Flower       -0.07
    Employees     -0.07
     heav         -0.06
    thumbnail     -0.06
    _compat       -0.06
    _BINDING      -0.06
    ,param        -0.06
    .social       -0.06
     подк         -0.06
    patches       -0.06
    Positive Logits
                  0.07
                  0.06
    US            0.06
     armed        0.06
                  0.06
                  0.06
    وبی           0.06
     emin         0.06
     Jim          0.06
     Tutor        0.06
    Act Density 0.042%

    No Known Activations