    Negative Logits
     entra       -0.08
     ot          -0.08
     света       -0.08
    pipeline     -0.07
    _PIPE        -0.07
     pipeline    -0.07
    rapper       -0.07
    ebe          -0.07
     vet         -0.07
    рад          -0.07
    Positive Logits
    WL            0.09
     Backend      0.08
     vezi         0.08
    sw            0.08
     caravan      0.08
    ieto          0.08
                  0.07
     Alle         0.07
    Sw            0.07
     barg         0.07
    Activation Density: 0.005%

    No Known Activations