INDEX
Explanations
references to the human body and its functions
Neuron Alignment
Index   Value   % of L₁
356     +0.16   0.9%
476     +0.11   0.6%
407     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
356     +0.16      0.03
93      +0.11      0.03
128     +0.11      0.01
Negative Logits
ª        -2.38
ĩ        -2.09
Ī        -1.91
ĨĴ       -1.91
ľĵ       -1.79
§        -1.78
ĵ        -1.77
behalf   -1.75
Ĭ        -1.72
Ĵ        -1.68
Positive Logits
guard     2.08
guards    2.03
builder   1.97
weight    1.88
work      1.83
piece     1.81
bone      1.75
pieces    1.70
cavity    1.69
bones     1.67
Activations Density: 0.101%