INDEX
Explanations
words related to propaganda and misinformation
references to propaganda and its various forms and contexts
Negative Logits
ergy     -0.80
heed     -0.73
cker     -0.70
berry    -0.68
Eucl     -0.68
hens     -0.68
clud     -0.68
kens     -0.68
Roses    -0.67
gger     -0.66
Positive Logits
aganda                1.11
posters               0.86
leaflets              0.85
propaganda            0.84
dissemin              0.83
suppression           0.81
dissemination         0.80
disinformation        0.79
blitz                 0.79
isSpecialOrderable    0.75
Activations Density: 0.025%