INDEX
Explanations
references to watching cartoons on a Saturday morning and the enjoyment associated with it
Neuron Alignment
Index   Value   % of L₁
674     +0.15   0.5%
1842    +0.12   0.4%
198     +0.12   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1801    +0.15      0.04
845     +0.12      0.04
198     +0.12      0.04
Negative Logits
PCell          -0.67
Furthermore    -0.61
Moreover       -0.60
Additionally   -0.59
Moreover       -0.58
])):           -0.57
Cormack        -0.57
Furthermore    -0.57
Dermott        -0.56
Ultimately     -0.56
Positive Logits
indestru    1.18
shenan      1.15
increa      1.11
scrat       1.10
impra       1.06
excru       1.06
viciss      1.05
fluo        1.05
affez       1.04
Messieurs   1.03
Activations Density 0.423%