    Explanations

    discriminatory language and attitudes towards race and work ethics

    Negative Logits
    Token                 Logit
    "IntoConstraints"     -0.62
    "Tembelea"            -0.54
    "parsedMessage"       -0.54
    " Савезне"            -0.50
    "Datuak"              -0.50
    "@@@@@"               -0.50
    "Personendaten"       -0.49
    "GEBURTSDATUM"        -0.49
    " CreateTagHelper"    -0.48
    "gonic"               -0.48

    Positive Logits
    Token           Logit
    " lazy"         1.94
    " laziness"     1.74
    "lazy"          1.57
    " Lazy"         1.55
    "Lazy"          1.48
    " indol"        1.41
    " lazily"       1.35
    " inaction"     1.34
    " slack"        1.30
    " apathy"       1.29

    Activation Density: 0.799%

    No Known Activations