INDEX
    Explanations

    expressions related to politics and opinions

    expressions of conflict and opposition

    NEGATIVE LOGITS
     ›
    -0.68
    )",
    -0.67
    Enlarge
    -0.66
    Previously
    -0.64
    "),
    -0.63
    ),"
    -0.63
    ")
    -0.63
    Initially
    -0.62
    ","
    -0.62
    earable
    -0.60
    POSITIVE LOGITS
     coward
    0.78
     goddamn
    0.74
     damned
    0.74
     patri
    0.72
     Genocide
    0.72
     hypocritical
    0.71
     dehuman
    0.71
     hypocrisy
    0.71
     fools
    0.70
     fucking
    0.70
    Activation Density: 2.242%

    No Known Activations