INDEX
Explanations
themes related to power dynamics and the influence of ruling classes in society
Negative Logits
Tick         -0.16
isson        -0.16
onica        -0.15
tick         -0.15
Bour         -0.15
Tick         -0.14
ERRU         -0.14
uzzi         -0.14
icus         -0.14
лиц          -0.14
Positive Logits
/power       0.19
control      0.18
(power       0.18
(control     0.17
power        0.17
control      0.17
power        0.16
influence    0.15
-control     0.15
Influence    0.15
Activations Density 0.183%