INDEX
    Explanations

    expressions of inappropriateness or dissatisfaction regarding social behavior and comments

    inappropriate, improper, or insensitive

    Negative Logits
     apaixon            -0.34
     económicas         -0.33
     económica          -0.33
     chargeur           -0.33
     economía           -0.33
     remplissage        -0.32
     naviguant          -0.32
     permukaan          -0.32
     cerâmica           -0.32
    KommentareTeilen    -0.31
    Positive Logits
     ſp         0.56
     Reſ        0.52
    ſelf        0.52
     Majefty    0.52
    keted       0.51
     leaſt      0.51
     Chriftian  0.50
     ſei        0.50
     ſta        0.50
     miſ        0.50
    Activation Density 0.027%

    No Known Activations