Explanations
references to or descriptions of circles
Neuron Alignment
Index    Value    % of L₁
1472     +0.13    0.5%
316      +0.12    0.4%
228      +0.12    0.4%
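The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that it gives each neuron's absolute value as a share of the L₁ norm of the whole alignment vector. A minimal Python sketch under that assumption follows; `l1_share` is a hypothetical helper and the vector is toy data, not values from this page.

```python
def l1_share(values, index):
    """Return |values[index]| as a fraction of the L1 norm of `values`.

    Hypothetical helper: assumes "% of L1" means share of the L1 norm.
    """
    l1_norm = sum(abs(v) for v in values)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; share is undefined")
    return abs(values[index]) / l1_norm


# Toy alignment vector (illustrative only, not taken from the dashboard).
toy = [0.5, -0.25, 0.25]
share = l1_share(toy, 0)
print(f"{share:.1%}")
```

Under this reading, a small percentage for the top neuron would mean the feature is spread across many neurons instead of being concentrated in a few.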
Correlated Neurons
Index    P. Corr.    Cos Sim.
1472     +0.13       0.02
316      +0.12       0.02
555      +0.12       0.02
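Assuming "P. Corr." denotes the Pearson correlation between two activation series and "Cos Sim." the cosine similarity between two vectors (the page does not say which quantities are compared), both measures can be sketched in plain Python as below. The function names and inputs are illustrative and not part of the dashboard.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

The two measures differ in that Pearson correlation subtracts the mean before comparing, while cosine similarity does not, so the same pair of neurons can score differently in the two columns.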
Negative Logits
vache        -0.52
Tradu        -0.52
бовь         -0.51
tache        -0.51
purtroppo    -0.51
bergen       -0.51
adesso       -0.50
frambo       -0.49
hek          -0.49
klap         -0.48
Positive Logits
circle      1.47
circles     1.35
circle      1.28
Circle      1.26
circular    1.19
circles     1.17
circling    1.16
Circle      1.15
CIRCLE      1.12
Circles     1.11
Activations Density 0.108%