INDEX
Explanations
phrases related to learning techniques or exercises
Neuron Alignment
Index    Value    % of L₁
1978     +0.10    0.3%
1577     +0.08    0.2%
876      +0.07    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
2044     +0.10       0.08
83       +0.08       0.04
1415     +0.07       0.02
Negative Logits
dises      -1.10
Keny       -1.08
dispen     -1.03
Augu       -0.99
Khart      -0.95
Juf        -0.94
ausp       -0.94
sii        -0.89
Rumania    -0.89
seiz       -0.89
Positive Logits
mastered       0.83
mastering      0.79
skill          0.77
mastery        0.75
practice       0.73
beginner       0.71
proficiency    0.69
practicing     0.68
practiced      0.68
skills         0.68
Activations Density 0.811%