INDEX
Explanations
mentions of specific individuals and leadership roles
Neuron Alignment
Index    Value    % of L₁
599      +0.45    1.6%
1531     +0.13    0.5%
137      +0.12    0.4%
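As a rough illustration of the "% of L₁" column, the sketch below assumes it reports each neuron's absolute weight as a share of the L₁ norm of the whole weight vector. That reading is an assumption, not something the page states, and `percent_of_l1` and the toy vector are hypothetical names and values, not the real feature weights.

```python
def percent_of_l1(weights):
    """Return each entry's share of the vector's L1 norm, in percent.

    Assumption: "% of L1" means 100 * |w_i| / sum_j |w_j|.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        # An all-zero vector has no meaningful shares.
        return [0.0 for _ in weights]
    return [100.0 * abs(w) / l1_norm for w in weights]


# Hypothetical toy vector, chosen only to make the arithmetic easy to follow.
toy_weights = [0.5, -0.25, 0.25]
shares = percent_of_l1(toy_weights)
```

Under this reading, a small percentage for the top neuron would indicate that the weight mass is spread across many neurons rather than concentrated in one.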
Correlated Neurons
Index    P. Corr.    Cos Sim.
599      +0.45       0.16
137      +0.13       0.11
1531     +0.12       0.06
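The two columns above can be compared with a minimal sketch, assuming "P. Corr." is a Pearson correlation and "Cos Sim." a cosine similarity. The page does not say which vectors are compared, so the inputs here are generic lists, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Undefined for zero vectors; return 0.0 by convention in this sketch.
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)
```

The two measures differ only in the mean-centering step, which is one reason a neuron can rank differently by correlation than by cosine similarity.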
Negative Logits
bewerken           -0.67
tagHelperRunner    -0.66
EconPapers         -0.60
❮                  -0.59
SourceChecksum     -0.59
৳                  -0.54
referrerpolicy     -0.54
ipedi              -0.52
SEDS               -0.52
NOPQRST            -0.52
Positive Logits
embodi             0.79
squa               0.78
unlaw              0.77
»>                 0.76
unwarran           0.74
unve               0.74
fup                0.71
impractica         0.71
ftu                0.70
Ecclesiastical     0.70
Activations Density: 2.172%