Explanations
words related to political or social ideologies that carry a certain weight or influence
mentions of different ideologies
Negative Logits
Token      Logit
suspended  -0.71
Pwr        -0.65
alls       -0.64
Bucks      -0.64
round      -0.63
ires       -0.63
Browns     -0.62
Steps      -0.62
Dee        -0.61
snapped    -0.61
Positive Logits
Token          Logit
ideology       3.37
ideologies     2.44
ideological    2.22
Ide            2.04
ideologically  1.82
dogma          1.75
ethos          1.61
worldview      1.59
ide            1.59
orthodoxy      1.54
Activations Density: 0.023%