INDEX
Explanations
political and societal terms related to ideology and governance
Neuron Alignment
Index    Value    % of L₁
468      +0.09    0.3%
1061     +0.08    0.2%
766      +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
2046     +0.09       0.03
1061     +0.08       0.03
468      +0.08       0.03
Negative Logits
OFDb       -0.71
.*")]      -0.67
censiti    -0.66
"]);       -0.62
faceva     -0.60
']))       -0.59
)_/¯       -0.58
'])){      -0.57
"]));      -0.57
Junio      -0.57
Positive Logits
increa        1.06
inappro       1.01
impra         1.01
outlander     0.96
kawasaki      0.95
suscep        0.94
impractica    0.93
disagre       0.93
strick        0.93
gsx           0.92
Activations Density: 0.404%