INDEX
Explanations
phrases related to political ideologies and concepts like dictatorship, oppression, social structures, and critique of societal norms
Negative Logits
poke           -0.53
LOD            -0.53
($             -0.50
-$             -0.50
Alert          -0.50
topped         -0.50
tackle         -0.49
boarding       -0.49
gc             -0.48
advertising    -0.47
Positive Logits
slightest      0.85
latter         0.77
ocratic        0.72
greatest       0.71
doctrine       0.68
virtues        0.68
ses            0.67
vast           0.67
Greeks         0.65
notion         0.65
Activations Density: 14.565%