Explanations
terms related to engineering
Neuron Alignment
Index   Value   % of L₁
1870    +0.15   0.6%
1548    +0.14   0.5%
1705    +0.13   0.5%
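
The page does not define the "% of L₁" column. One plausible reading, stated here as an assumption, is that it reports the absolute value of one entry as a share of the L₁ norm (sum of absolute values) of the whole weight vector. A minimal sketch of that reading, with a hypothetical helper name `l1_share` and toy numbers that are not taken from the table:

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: this is how a "% of L1" column could be computed;
    the dashboard itself does not state its formula.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; share is undefined")
    return abs(weights[index]) / l1_norm


# Toy vector for illustration only (not values from the dashboard).
toy_weights = [0.5, -0.25, 0.25]
print(f"{l1_share(toy_weights, 0):.1%}")  # prints "50.0%"
```

Under this reading, a value of +0.15 accounting for about 0.6% would imply an L₁ norm of roughly 25, which would also be consistent with the other two rows after rounding.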
Correlated Neurons
Index   P. Corr.   Cos Sim.
1548    +0.15      0.02
1705    +0.14      0.02
156     +0.13      0.02
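
The two columns above are standard similarity measures: Pearson correlation ("P. Corr.") and cosine similarity ("Cos Sim."). The page does not say which vectors each column is computed over, so the sketch below only illustrates the textbook definitions on toy data. The function names are hypothetical and it assumes non-constant, non-zero input vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Toy data for illustration only (not values from the dashboard).
x = [1.0, 2.0, 3.0]
y = [3.0, 2.0, 1.0]
print(round(pearson_correlation(x, y), 3))  # -1.0
print(round(cosine_similarity(x, y), 3))    # 0.714
```

Because Pearson correlation subtracts the mean before comparing, the two measures can disagree substantially on the same pair of vectors, as the toy example shows.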
Negative Logits
guma       -0.59
kram       -0.58
hina       -0.52
katal      -0.51
saha       -0.50
karna      -0.49
pardoned   -0.48
kotor      -0.48
Skor       -0.47
simpel     -0.47
Positive Logits
engineering   1.30
engineer      1.27
Engineering   1.24
engineers     1.23
Engineering   1.20
Engineer      1.17
engineer      1.10
Engineer      1.10
Engineers     1.10
engineering   1.09
Activations Density: 0.061%