INDEX
    Explanations

    phrases related to censorship and suppression

    actions related to suppression and censorship

    Negative Logits
    Token        Logit
    "ortment"    -0.78
    "esides"     -0.65
    "luster"     -0.65
    " Suc"       -0.65
    "immer"      -0.64
    "ammy"       -0.64
    "olds"       -0.64
    "âĹ¼"        -0.63
    "-------"    -0.61
    "itched"     -0.61
    Positive Logits
    Token            Logit
    "ively"          0.92
    "uate"           0.81
    " him"           0.79
    " offending"     0.78
    " opposing"      0.77
    "enance"         0.75
    " them"          0.73
    " everything"    0.72
    " dissent"       0.72
    " oneself"       0.71
    Act Density 0.199%

    No Known Activations