Explanations
descriptions of specific events that are being witnessed or experienced by individuals
Neuron Alignment
Index    Value    % of L₁
1535     +0.25    0.8%
2034     +0.23    0.8%
382      +0.19    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
382      +0.25       0.10
1535     +0.23       0.07
195      +0.19       0.06
Negative Logits
perfon      -1.24
disagre     -1.24
embodi      -1.21
emphat      -1.21
unwarran    -1.17
viciss      -1.15
guarante    -1.15
affor       -1.13
perfet      -1.11
increa      -1.09
Positive Logits
Then             0.96
They             0.84
After            0.83
However          0.82
Once             0.78
During           0.77
Eventually       0.77
Unfortunately    0.77
While            0.77
When             0.76

Activations Density: 0.345%