INDEX
Explanations
starting and ending phrases in texts
Neuron Alignment
Index    Value    % of L₁
159      +0.09    0.3%
1503     +0.07    0.2%
1822     +0.07    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1497     +0.09       0.03
1030     +0.07       0.02
1309     +0.07       0.02
Negative Logits
ftu      -0.87
fta      -0.86
»>       -0.85
thut     -0.83
:,,      -0.83
«<       -0.80
fince    -0.79
fte      -0.79
ftre     -0.79
Abbé     -0.77
Positive Logits
currently    0.92
immediate    0.91
presently    0.86
current      0.83
current      0.75
today        0.73
interim      0.73
Currently    0.72
Currently    0.72
immediate    0.72
Activations Density: 0.345%