INDEX
Explanations
words related to censorship
references to censorship
NEGATIVE LOGITS
Slot      -0.82
»Ĵ        -0.78
waukee    -0.73
RV        -0.72
Var       -0.70
Fit       -0.70
Moines    -0.69
reci      -0.68
swing     -0.68
Kal       -0.67
POSITIVE LOGITS
censorship      3.93
censor          3.15
cens            2.79
censored        2.75
cens            2.71
blacklist       1.52
repression      1.45
libel           1.42
repressive      1.37
totalitarian    1.33
Activations Density: 0.026%