INDEX
    Explanations
    NEGATIVE LOGITS
    "MK"            -0.07
    " disparate"    -0.06
    ":)];↵"         -0.06
    " tantal"       -0.06
    " postpon"      -0.06
    " parole"       -0.06
    " yön"          -0.06
    "asına"         -0.06
    "obre"          -0.06
    " hale"         -0.06
    POSITIVE LOGITS
    "(item"         0.07
    "	my"           0.07
    " falling"      0.06
    " можете"       0.06
    "fh"            0.06
    " ending"       0.06
    "/style"        0.06
    "addon"         0.06
    " aime"         0.06
    " ws"           0.06
    Act Density 0.000%

    No Known Activations