INDEX
    Explanations

    expressions of disagreement and argumentation

    claims and statements regarding political or social issues

    Negative Logits
    Token         Logit
    "OTUS"        -0.78
    "Bonus"       -0.71
    " Himself"    -0.69
    "wn"          -0.66
    "ゼウス"      -0.65
    "odiac"       -0.65
    "avis"        -0.64
    "otion"       -0.64
    "alt"         -0.63
    "hyde"        -0.63
    Positive Logits
    Token                Logit
    " unfair"            1.08
    " unfairly"          0.95
    " undue"             0.77
    " inadequate"        0.76
    " misrepresent"      0.75
    " misleading"        0.75
    " discriminatory"    0.75
    " loopholes"         0.74
    " threatened"        0.71
    " unjust"            0.71
    Activation Density: 0.223%

    No Known Activations