INDEX
Explanations
phrases related to decision-making processes and voting procedures
Neuron Alignment
Index    Value    % of L₁
1499     +0.09    0.3%
1553     +0.09    0.2%
81       +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
81       +0.09       0.02
1499     +0.09       0.04
2045     +0.08       0.04
Negative Logits
indestru      -0.73
pamph         -0.62
unwarran      -0.62
depic         -0.60
liberality    -0.60
encomp        -0.59
unlaw         -0.59
sophistic     -0.58
unspeak       -0.58
Unto          -0.58
Positive Logits
votes            0.68
toscana          0.66
Hür              0.66
minimalis        0.62
infrastruktur    0.62
silikon          0.61
karet            0.61
kug              0.60
demokra          0.60
vote             0.60
Activations Density: 0.271%