INDEX
    Explanations

    phrases related to political or social activism

    references to the concept of being anti or against something

    Negative Logits
    ding        -0.97
    cutting     -0.86
    edin        -0.83
    nets        -0.82
    rings       -0.81
    mable       -0.79
    alties      -0.78
    ells        -0.77
    erick       -0.77
    accompan    -0.77
    Positive Logits
    opsis       0.85
    urdue       0.80
    oxin        0.79
    iso         0.76
    ño          0.73
    henko       0.73
    Wan         0.71
    xon         0.69
    chio        0.68
    ���         0.68
    Activation Density: 0.060%

    No Known Activations