INDEX
    Explanations

    words related to censorship and control of language

    Negative Logits
    "swick"     -0.79
    "amac"      -0.77
    "verty"     -0.75
    "ilater"    -0.75
    "ptoms"     -0.73
    "docker"    -0.70
    "ancial"    -0.69
    "itness"    -0.68
    "ndra"      -0.67
    "erald"     -0.66
    Positive Logits
    " cens"         1.01
    " censorship"   0.89
    " censor"       0.85
    " censored"     0.81
    "zers"          0.74
    " cutter"       0.71
    " levied"       0.70
    " viol"         0.69
    " promulg"      0.68
    "cens"          0.68
    Act Density 0.027%

    No Known Activations