INDEX
Explanations
expressions of repetition or ellipsis in text
Neuron Alignment
Index   Value   % of L₁
478     +0.20   1.1%
127     +0.15   0.8%
470     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
5       +0.20      0.03
470     +0.15      0.03
445     +0.12      0.01
Negative Logits
alties        -1.84
izations      -1.77
alty          -1.76
ties          -1.73
acters        -1.72
icity         -1.71
acquisition   -1.66
ificates      -1.61
development   -1.61
tees          -1.60
Positive Logits
¸             1.85
unnumbered    1.75
не            1.62
Ŀ             1.60
Ķ             1.56
behold        1.48
quo           1.44
pite          1.43
denly         1.41
jest          1.40
Activations Density 0.085%