INDEX
    Explanations

    sentences containing expressions of arrogance and disrespect towards authority figures

    Negative Logits
    IsMutable          -0.65
    inck               -0.59
     juu               -0.56
     nahilalakip       -0.56
    المكان             -0.56
     esternos          -0.56
    TagHelper          -0.55
     squ               -0.55
    Билгалдахарш       -0.55
    ارف                -0.55
    Positive Logits
     patriots          0.67
    Filmographie       0.65
     popoli            0.60
     comunista         0.59
     negroes           0.57
     Moslem            0.57
     EdgeInsets        0.55
     egli              0.53
    PreferredItem      0.52
     mnoho             0.52
    Activation Density: 0.349%

    No Known Activations