INDEX
    Explanations

    words related to serious wrongdoing or intense negative actions

    expressions of extreme wrongdoing or moral outrage

    Negative Logits
    ulton     -0.79
    runner    -0.79
    ym        -0.74
    wrapper   -0.71
    hner      -0.71
    ather     -0.68
    roe       -0.67
    chrom     -0.66
    runners   -0.66
    upp       -0.65
    Positive Logits
     egregious    0.95
     abuses       0.90
     offender     0.89
     injustice    0.89
     heinous      0.88
     offenders    0.85
     violations   0.82
     viol         0.79
     injust       0.76
     egreg        0.73
    Activation Density 0.009%

    No Known Activations