INDEX
    Explanations

    statements expressing disbelief or criticism regarding societal issues

    Negative Logits
     respectivement     -0.51
     fond               -0.42
     Ruhm               -0.42
    Так                 -0.41
    ioutil              -0.41
     genauer            -0.41
     ao                 -0.40
    分别                -0.40
    anz                 -0.40
     laut               -0.39
    Positive Logits
     BoxFit             0.88
     beginnetje         0.83
     InputDecoration    0.76
     NUKAT              0.75
    XmlAccessType       0.74
    uxxxx               0.74
    Personensuche       0.73
     تانيه              0.72
    theless             0.71
    GEBURTSDATUM        0.71
    Activation Density: 0.234%

    No Known Activations