INDEX
Explanations
phrases related to supporting a cause or belief
references to causes, ideals, and political messages
Negative Logits
Token      Logit
ursed      -0.65
Kers       -0.62
PIT        -0.62
inka       -0.59
Ravens     -0.59
arling     -0.59
leep       -0.59
administ   -0.58
aband      -0.58
ット        -0.58
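Some entries in the token list are stored as byte-level vocabulary strings rather than readable text. The following is a minimal sketch of how such a string can be decoded, assuming the model uses a GPT-2-style byte-level BPE vocabulary (an assumption; the dashboard does not name the tokenizer). The helper names `bytes_to_unicode` and `decode_token` are illustrative.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table that maps each byte to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


def decode_token(token: str) -> str:
    """Map a byte-level vocabulary string back to the UTF-8 text it encodes."""
    char_to_byte = {c: b for b, c in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĥĥãĥĪ"))  # ット
```

Under that assumption, the raw string `ãĥĥãĥĪ` is the six bytes `E3 83 83 E3 83 88`, which is the UTF-8 encoding of the katakana pair ット.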
Positive Logits
Token       Logit
geist       0.89
creed       0.87
ologies     0.85
ideals      0.85
ifice       0.83
worldview   0.77
ideology    0.76
championed  0.75
dogma       0.75
preached    0.74
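Logit lists like these are commonly produced by projecting a feature's output direction onto the model's unembedding matrix and ranking the vocabulary by the resulting score. The sketch below illustrates that ranking step only; it assumes this is how the numbers were derived, which the dashboard does not state. The function name `top_logit_tokens`, the two-dimensional direction, and the three-token vocabulary are toy values chosen for illustration.

```python
def top_logit_tokens(direction, unembed, vocab, k=10):
    """Rank tokens by the dot product of a feature direction with each unembedding row.

    direction: list of floats, the feature's (hypothetical) output direction
    unembed:   list of rows, one per vocabulary token, same length as direction
    vocab:     list of token strings aligned with the rows of unembed
    Returns (most_negative, most_positive), each a list of (token, score) pairs.
    """
    scores = [sum(d * w for d, w in zip(direction, row)) for row in unembed]
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1])
    most_negative = ranked[:k]
    most_positive = ranked[::-1][:k]
    return most_negative, most_positive


# Toy example, not real model weights.
direction = [1.0, 0.0]
vocab = ["creed", "leep", "the"]
unembed = [[0.87, 0.1], [-0.59, 0.3], [0.0, 0.5]]

negative, positive = top_logit_tokens(direction, unembed, vocab, k=1)
print(negative, positive)
```

With real weights, `unembed` would have one row per vocabulary entry, and `k=10` would yield two ten-row lists in the same shape as the tables above.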
Activations Density: 0.210%
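Activation density is usually read as the fraction of sampled tokens on which the feature fires at all. A minimal sketch of that calculation follows, assuming "fires" means an activation strictly above a threshold of zero; the function name `activation_density` and the sample counts are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 21 active tokens out of 10,000.
sample = [1.0] * 21 + [0.0] * 9979
print(f"{activation_density(sample):.3%}")  # 0.210%
```

A density of 0.210% therefore corresponds to roughly 21 active tokens in every 10,000 sampled, which is consistent with a sparse, topic-specific feature.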