Explanations
themes related to social inequality and the influence of the wealthy elite over society
Negative Logits
Token    Logit
ara      -0.16
icari    -0.15
ynet     -0.15
ikh      -0.14
iyan     -0.14
598      -0.14
deaux    -0.14
bourg    -0.14
esub     -0.13
dna      -0.13
Positive Logits
Token          Logit
control        0.36
controls       0.33
exercise       0.33
Controls       0.30
control        0.29
controlling    0.29
exerc          0.27
controls       0.27
Control        0.27
kontrol        0.27
Activations Density: 0.191%