INDEX
Explanations
mentions of specific operating systems and software-related terms
Neuron Alignment
Index   Value   % of L₁
198     +0.15   0.9%
346     +0.13   0.8%
292     +0.13   0.8%
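The page does not define the "% of L₁" column. One plausible reading, stated here as an assumption, is that each value is one component of the feature's direction vector over neurons, and the percentage is that component's absolute value divided by the L₁ norm of the whole vector. A minimal sketch under that assumption follows; the function name `top_aligned_neurons` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def top_aligned_neurons(direction, k=3):
    """Return the k components of `direction` with the largest magnitude.

    Each result is (index, value, share), where share is |value| divided by
    the L1 norm of the full vector. This is an ASSUMED definition of the
    "% of L1" column, not one confirmed by the page.
    """
    l1_norm = sum(abs(v) for v in direction)
    if l1_norm == 0:
        return []
    # Rank neuron indices by absolute weight, largest first.
    ranked = sorted(range(len(direction)),
                    key=lambda i: abs(direction[i]),
                    reverse=True)
    return [(i, direction[i], abs(direction[i]) / l1_norm) for i in ranked[:k]]


# Toy vector for illustration only; not data from this page.
toy = [0.5, -0.25, 0.0, 0.25]
rows = top_aligned_neurons(toy, k=2)
for index, value, share in rows:
    print(f"{index:<7} {value:+.2f}   {share:.1%}")
```

Under this reading, a share below 1% for the top neuron would mean the direction is spread thinly across many neurons rather than concentrated in a few.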
Correlated Neurons
Index   Pearson Corr.   Cos Sim.
346     +0.15           0.08
345     +0.13           0.07
475     +0.13           0.05
Negative Logits
quo          -1.84
infertility  -1.76
death        -1.53
icable       -1.47
agenda       -1.46
warranted    -1.42
ambitious    -1.41
career       -1.40
DESC         -1.32
autoimmune   -1.29
Positive Logits
ooth         1.68
]];          1.67
stic         1.61
Ń            1.58
¢            1.58
.]{}         1.57
inger        1.57
.]           1.51
cdn          1.51
assen        1.48
Activations Density: 1.278%