    Negative Logits
    ите           -0.07
    وری           -0.07
    ['_           -0.06
     іншими       -0.06
    xz            -0.06
     drown        -0.06
    173           -0.06
    机场          -0.06
    imde          -0.06
    normalize     -0.06
    Positive Logits
    -cookie       0.08
     accompl      0.07
    _Settings     0.06
     misguided    0.06
     generics     0.06
    ‌پدیا         0.06
     adul         0.06
     Asi          0.06
     Webcam       0.06
     Fetish       0.06
    Activation Density: 0.004%

    No Known Activations