INDEX
Explanations
steps and ingredients in a recipe
Neuron Alignment
Index    Value    % of L₁
876      +0.16    0.5%
906      +0.13    0.4%
736      +0.12    0.3%
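The "% of L₁" column is not defined on this page. One plausible reading, sketched below, is each neuron's absolute value expressed as a share of the L1 norm of the whole alignment vector. The function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_share(values):
    """Return each entry's absolute value as a percentage of the vector's L1 norm.

    Assumption: "% of L1" means |v_i| / sum_j |v_j|; the page does not confirm this.
    """
    l1_norm = sum(abs(v) for v in values)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(v) / l1_norm for v in values]


# Hypothetical toy vector, not data from the dashboard.
toy_vector = [0.5, -0.25, 0.25]
shares = l1_share(toy_vector)
print(shares)
```

Under this reading, small percentages such as those in the table would indicate that the alignment is spread over many neurons rather than concentrated in a few.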
Correlated Neurons
Index    P. Corr.    Cos Sim.
736      +0.16       0.04
474      +0.13       0.02
1179     +0.12       0.01
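The two columns report different similarity measures, and they need not agree. A minimal sketch of the distinction follows, assuming "P. Corr." is the Pearson correlation and "Cos Sim." is the cosine similarity. Which vectors the dashboard compares is not stated here, so the inputs below are hypothetical. Pearson correlation is the cosine similarity of the mean-centered vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as cosine similarity after mean-centering."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical vectors, not data from the dashboard.
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]
cos = cosine_similarity(a, b)       # about 0.714: both vectors are all-positive
corr = pearson_correlation(a, b)    # approximately -1.0: they move in opposite directions
print(cos, corr)
```

The example shows why a pair can have a clearly nonzero correlation while its cosine similarity stays close to zero, or the reverse.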
Negative Logits
Token         Logit
monaster      -1.04
Wikisource    -0.84
Milán         -0.82
ideolog       -0.81
Pediat        -0.79
Rektor        -0.79
Siria         -0.78
valla         -0.77
Occidente     -0.74
Sankt         -0.73
Positive Logits
Token       Logit
increa      1.00
yoda        0.99
impra       0.95
affor       0.95
snoopy      0.94
pollut      0.94
unve        0.91
shenan      0.91
simpsons    0.90
jurassic    0.89
Activations Density: 0.105%
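The page does not say how the density figure is computed. A common reading, assumed in this sketch, is the percentage of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of activations strictly greater than `threshold`.

    Assumption: density means the share of tokens on which the feature fires;
    the dashboard does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one activation")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 1 of 4 tokens.
sample = [0.0, 0.0, 1.2, 0.0]
density = activation_density(sample)
print(density)
```

Under this reading, a density near 0.1% would mean the feature fires on roughly one token in a thousand.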