Explanations
phrases related to geopolitical viewpoints, particularly focused on contrasting Western and anti-Western perspectives
Neuron Alignment
Index    Value    % of L₁
752      +0.12    0.3%
1253     +0.10    0.3%
764      +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
16       +0.12       0.05
752      +0.10       0.04
766      +0.09       0.03
Negative Logits
juft      -0.70
thut      -0.69
muft      -0.69
feen      -0.68
fhould    -0.67
faid      -0.66
obfer     -0.66
foon      -0.65
ftill     -0.65
:,,       -0.65
Positive Logits
vedo                0.60
reality             0.55
},[])               0.54
'&:                 0.52
sstream             0.51
marginHorizontal    0.50
procedere           0.50
graphicx            0.48
glMatrixMode        0.48
stdarg              0.48
Activations Density 0.281%