Explanations
Mentions of "Pr" followed by a number, specifically in the context of processes or products.
Neuron Alignment
Index   Value   % of L₁
156     +0.15   0.9%
350     +0.14   0.8%
203     +0.13   0.7%
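The "% of L₁" column can be read as each neuron's share of the L₁ norm of the feature's weight vector. The page does not state how the column is computed, so the sketch below is only an assumed reading: the function name `l1_share` and the toy vector are hypothetical, not taken from the dashboard.

```python
def l1_share(weights):
    """Return each entry's share of the vector's L1 norm, in percent.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The source page
    does not confirm this definition.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1 for w in weights]


# Hypothetical toy vector, not the feature's real weights.
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
```

Under this reading, small shares for the top three neurons would mean the feature's weight is spread thinly across many neurons rather than concentrated in a few.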
Correlated Neurons
Index   P. Corr.   Cos Sim.
350     +0.15      0.03
172     +0.14      0.02
386     +0.13      0.01
Negative Logits
generality   -1.83
ľĵ           -1.78
eri          -1.72
ering        -1.68
wards        -1.64
lla          -1.60
shore        -1.58
Islands      -1.57
eration      -1.55
raine        -1.50
Positive Logits
imal     1.75
ings     1.72
inters   1.71
incess   1.62
inces    1.60
iment    1.57
inter    1.49
ige      1.48
uning    1.48
aside    1.48
Activations Density: 0.083%
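Activation density is commonly taken to be the fraction of sampled tokens on which a feature fires. The page does not define the term, so the following is a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the toy activations are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (positive) activation. The source page does not confirm this.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical toy activations: the feature fires on one of four tokens.
toy = [0.0, 0.0, 1.2, 0.0]
density = activation_density(toy)
```

On this reading, a density of 0.083% would correspond to roughly one active token in every 1,200 sampled.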