    Negative Logits
    " USDA"          -0.08
    " Kef"           -0.08
    " Noticias"      -0.07
    " Bitmap"        -0.07
    " spac"          -0.07
    " অন্যতম"         -0.07
    " indoor"        -0.07
    " iconic"        -0.07
    " Bila"          -0.07
    "ديد"            -0.07

    Positive Logits
    " harassment"    0.10
    "айтесь"         0.09
    " oppression"    0.09
    " общения"       0.08
    " обращаться"    0.08
    "amate"          0.08
    " намер"         0.08
    " anguish"       0.08
    "Mocks"          0.08
    "assment"        0.08

    Activation Density: 0.002%

    No Known Activations