INDEX
Explanations
references to political values and principles related to governance
Negative Logits
ç¯          -0.07
esi         -0.07
lfw         -0.07
ertia       -0.07
559         -0.07
_RM         -0.07
ÏĥοÏħ       -0.07
avin        -0.07
åģı         -0.07
EXEMPLARY   -0.07
Positive Logits
equality    0.08
fair        0.08
free        0.08
equal       0.07
justice     0.07
respect     0.07
freedom     0.07
democracy   0.06
liberty     0.06
bodily      0.06
Activations Density: 0.047%