INDEX
    Explanations

    vocabulary related to censorship or filtering

    terms related to censorship and its implications

    Negative Logits
    Token           Logit
    " Scotia"       -0.64
    "ordinary"      -0.63
    "amaz"          -0.60
    "holding"       -0.59
    "sheet"         -0.59
    "zzi"           -0.58
    " Tinker"       -0.58
    "addafi"        -0.57
    "agne"          -0.56
    " McAuliffe"    -0.55
    Positive Logits
    Token           Logit
    "chen"          0.98
    "orship"        0.97
    "ource"         0.96
    "manship"       0.94
    "hift"          0.92
    "terday"        0.91
    "urable"        0.90
    "wear"          0.88
    "CRIP"          0.85
    "haw"           0.85
    Activation Density: 0.050%

    No Known Activations