INDEX
Explanations
instances of the letters "gs" in the text
Neuron Alignment
Index   Value   % of L₁
77      +0.12   0.7%
397     +0.11   0.6%
376     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
105     +0.12      0.01
24      +0.11      0.01
77      +0.11      0.01
Negative Logits
ı     -2.30
İ     -2.30
ĥ     -2.28
¿½    -2.24
Ĥ     -2.19
¼     -2.14
ĸ´    -2.10
Ĭ     -2.09
Ħ     -2.06
ĭ     -2.06
Positive Logits
heet        1.91
port        1.85
loop        1.81
enos        1.74
argument    1.73
velt        1.67
chaft       1.66
grounds     1.65
iop         1.64
ios         1.63
Activations Density 0.015%