    Explanations

    judgmental or opinionated statements about individuals and their identities

    Negative Logits
     مرئيه    -0.71
    Geplaatst    -0.61
    Jeografia    -0.59
    ].)    -0.59
    */}    -0.57
     Audiodateien    -0.56
    ="@+    -0.55
     חיצוניים    -0.53
     ")";    -0.53
    */)    -0.53
    Positive Logits
     NSCoder    0.58
    Nhưng    0.57
     inderdaad    0.56
     klingt    0.50
    域名    0.50
     σου    0.50
    houding    0.49
     Expédié    0.49
     думать    0.49
     पास    0.49
    Activation Density 0.147%

    No Known Activations