Explanations
references to propaganda in various contexts
political propaganda and manipulation
Negative Logits
pona            -0.48
\{\\            -0.47
uris            -0.47
HasColumnType   -0.46
EClass          -0.46
etes            -0.45
Dilemma         -0.45
Cerv            -0.44
(blank)         -0.44
spalle          -0.44
Positive Logits
propaganda      1.91
Propaganda      1.76
paganda         1.73
propagan        1.38
propa           0.90
propag          0.89
пропа           0.79
宣传            0.78
Propag          0.76
indoctr         0.76
Activations Density 0.012%