Explanations
themes related to political priorities and the balance between individualism and communal responsibility
Negative Logits
ohn     -0.17
ta      -0.16
791     -0.16
enco    -0.16
enc     -0.16
iba     -0.15
imli    -0.15
ign     -0.15
vid     -0.14
991     -0.14
Positive Logits
priorities    0.23
priority      0.23
prioritize    0.19
interests     0.18
concerns      0.18
safety        0.17
priority      0.17
.priority     0.17
ÏĢοÏį         0.16
afety         0.16
Activations Density 0.175%