    NEGATIVE LOGITS
    " bugs"              -0.08
    " breakfast"         -0.07
    " ware"              -0.07
    " bond"              -0.07
    " brunch"            -0.07
    " Wouldn"            -0.07
    "لو"                 -0.07
    (whitespace token)   -0.07
    " bolts"             -0.06
    "ari"                -0.06
    POSITIVE LOGITS
    "Inverse"                    0.07
    " Держав"                    0.06
    "Searching"                  0.06
    " konu"                      0.06
    "-ap"                        0.06
    (token missing in source)    0.06
    (token missing in source)    0.06
    " negligent"                 0.06
    " ГО"                        0.06
    " Rück"                      0.06
    Activation Density: 0.015%

    No Known Activations