INDEX
Explanations
references to hands or palm-related imagery
Neuron Alignment
Index   Value   % of L₁
148     +0.12   0.7%
104     +0.12   0.6%
448     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
448     +0.12      0.01
104     +0.12      0.01
409     +0.11      0.01
Negative Logits
illance     -1.58
rian        -1.56
rians       -1.51
mia         -1.51
simplest    -1.45
ĥ           -1.45
rations     -1.38
duction     -1.38
jection     -1.37
fections    -1.36
Positive Logits
notes        1.95
istry        1.89
istor        1.85
ilion        1.75
stone        1.75
istically    1.74
ier          1.71
caster       1.69
chair        1.65
iers         1.65
Activations Density: 0.012%