INDEX
Explanations
words related to suppression of dissent
references to dissent and actions related to it
Negative Logits
onut      -0.69
Tycoon    -0.67
stakes    -0.65
ammy      -0.65
GENERAL   -0.63
ohyd      -0.62
illac     -0.61
onz       -0.60
Mineral   -0.59
ategory   -0.59
Positive Logits
ers       1.01
dissent   0.90
ible      0.85
iates     0.84
aloud     0.81
ership    0.80
ously     0.77
ively     0.77
rained    0.75
iating    0.75
Activations Density 0.019%
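
The following is a minimal sketch of how ranked lists like the ones above could be read off a set of (token, logit) pairs. It reuses the values shown on this page. The helper name `top_k` is hypothetical and does not belong to any dashboard API, and the sketch does not claim to reproduce how the page itself computes these values.

```python
# Token/logit pairs copied from the lists on this page.
negative = [
    ("onut", -0.69), ("Tycoon", -0.67), ("stakes", -0.65), ("ammy", -0.65),
    ("GENERAL", -0.63), ("ohyd", -0.62), ("illac", -0.61), ("onz", -0.60),
    ("Mineral", -0.59), ("ategory", -0.59),
]
positive = [
    ("ers", 1.01), ("dissent", 0.90), ("ible", 0.85), ("iates", 0.84),
    ("aloud", 0.81), ("ership", 0.80), ("ously", 0.77), ("ively", 0.77),
    ("rained", 0.75), ("iating", 0.75),
]


def top_k(pairs, k, largest=True):
    """Return the k pairs with the largest (or smallest) logit values.

    Hypothetical helper for illustration only. Python's sort is stable,
    so tied values keep their original relative order.
    """
    return sorted(pairs, key=lambda pair: pair[1], reverse=largest)[:k]


all_pairs = negative + positive
top_pos = top_k(all_pairs, 3, largest=True)   # most promoted tokens
top_neg = top_k(all_pairs, 3, largest=False)  # most suppressed tokens

for token, value in top_pos + top_neg:
    print(f"{token:<10}{value:+.2f}")
```

Sorting the combined list and slicing from either end keeps the sketch short; for a full vocabulary, a partial selection such as `heapq.nlargest` would avoid sorting every entry.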