INDEX
Explanations
Mentions of political theory and philosophy, specifically the concepts of power, governmentality, and control mechanisms
Neuron Alignment
Index   Value   % of L₁
872     +0.17   0.5%
304     +0.10   0.3%
1870    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
872     +0.17      0.09
490     +0.10      0.03
1097    +0.10      0.06
Negative Logits
Embal       -0.54
موس         -0.53
gyors       -0.52
afio        -0.49
physically  -0.49
COMPR       -0.48
onlyOwner   -0.48
CONCLUS     -0.48
only        -0.47
sto         -0.47
Positive Logits
fatis       1.35
ftu         1.28
desir       1.27
nece        1.27
paff        1.24
fte         1.22
effe        1.21
fta         1.20
aen         1.19
hcm         1.19
Activations Density 0.909%