INDEX
    Explanations
    Negative Logits
    Token        Logit
    "_Zero"      -0.08
    " Да"        -0.07
    "kdir"       -0.07
    "ESPN"       -0.07
    " INCIDENT"  -0.06
    "AZY"        -0.06
    "\tfilter"   -0.06
    " inaug"     -0.06
    "ıza"        -0.06
                 -0.06
    Positive Logits
    Token        Logit
    "relative"   0.07
    " mRNA"      0.06
    " mast"      0.06
    " booty"     0.06
    " kite"      0.06
    " troop"     0.06
    " удов"      0.06
    "behavior"   0.06
    " jugg"      0.06
    " course"    0.06
    Act Density 0.000%

    No Known Activations