Explanations
misconceptions and beliefs regarding power dynamics in society and governance
Negative Logits
ConstraintMaker    -0.54
Controllo          -0.48
AddTagHelper       -0.46
chóng              -0.45
pleaſure           -0.45
Autorisations      -0.45
ghed               -0.44
Reſ                -0.44
المناصب            -0.43
CURIAM             -0.43
Positive Logits
mistakenly         0.57
erroneously        0.54
misconceptions     0.50
confused           0.50
misconception      0.50
wrongly            0.49
misunder           0.48
miscon             0.47
thinking           0.47
confuse            0.45
Activations Density 0.594%