Explanations
terms related to the evolution of different subjects, including code, psychology, and political views
Neuron Alignment
Index   Value   % of L₁
1379    +0.13   0.5%
1865    +0.12   0.4%
486     +0.10   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1379    +0.13      0.03
276     +0.12      0.03
1865    +0.10      0.02
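
The page does not define the two column headers above. A minimal sketch, assuming "P. Corr." is the Pearson correlation between the feature's activations and a neuron's activations over the same tokens, and "Cos Sim." is the cosine similarity between the feature's direction and the neuron's weight vector. The function names and the sample inputs are hypothetical, chosen only to illustrate the two formulas.

```python
import math


def pearson_corr(x, y):
    """Pearson correlation of two equal-length activation sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances, all left unnormalized (the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cosine_sim(u, v):
    """Cosine similarity of two equal-length direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy inputs, not data from this page.
feature_acts = [0.0, 0.0, 1.2, 0.0, 0.9]
neuron_acts = [0.1, -0.3, 0.8, 0.2, 0.4]
print(pearson_corr(feature_acts, neuron_acts))
print(cosine_sim([1.0, 0.0], [0.0, 1.0]))
```

Under this reading, a correlation near +0.13 paired with a cosine similarity near 0.03 would indicate that no single neuron tracks the feature closely.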
Negative Logits
Áng            -0.68
Rubén          -0.57
Justo          -0.52
Héctor         -0.50
Cabe           -0.48
Personne       -0.48
Compañ         -0.46
Lunes          -0.45
abriu          -0.45
frastructure   -0.45
Positive Logits
evolution      1.18
Evolution      1.13
evolution      1.12
evolu          1.12
Evolution      1.11
evolve         1.11
evolutionary   0.99
evolved        0.96
evolve         0.96
evolves        0.95
Activations Density: 0.085%