Explanations
mentions of time periods like decades
Neuron Alignment
Index   Value   % of L₁
67      +0.12   0.4%
1013    +0.10   0.3%
1691    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1856    +0.12      0.04
1962    +0.10      0.04
1830    +0.10      0.03
Negative Logits
مرئيه        -0.56
cydow        -0.46
ungkin       -0.44
AutoScale    -0.44
дописавши    -0.44
Tyl          -0.44
cipar        -0.43
Marzo        -0.43
agaimana     -0.42
Hab          -0.42
Positive Logits
decade       1.06
decade       1.06
decades      1.05
Decade       0.81
whofe        0.78
fince        0.77
unwarran     0.76
fays         0.74
withal       0.73
tolerably    0.73
Activations Density 0.059%