INDEX
Explanations
ingredients and quantities used in recipes
Neuron Alignment
Index   Value   % of L₁
33      +0.13   0.7%
420     +0.12   0.6%
327     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
104     +0.13      0.04
198     +0.12      0.07
253     +0.11      0.03
Negative Logits
ató        -1.62
iless      -1.53
addicted   -1.51
fiddle     -1.51
addict     -1.48
poverty    -1.43
STEM       -1.43
abil       -1.42
anta       -1.41
leep       -1.40
Positive Logits
º                3.83
ł                3.81
↵↵               3.72
↵↵               3.72
↵                3.72
(blank)          3.72
<|outofrange|>   3.72
(blank)          3.72
↵                3.72
↵                3.72
Activations Density: 1.234%