INDEX
Explanations
references to technical documentation and forums related to specific topics
Neuron Alignment
Index   Value   % of L₁
331     +0.18   1.0%
1328    +0.16   0.9%
1068    +0.14   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1363    +0.18      0.05
1741    +0.16      0.00
1328    +0.14      0.05
Negative Logits
<bos>          -0.87
DockStyle      -0.72
osoba          -0.63
complexType    -0.62
fortawesome    -0.61
Fer            -0.60
jspb           -0.59
nakalista      -0.58
ositol         -0.58
UnknownFields  -0.57
Positive Logits
affor          1.38
unspeak        1.34
reluct         1.30
stockholm      1.23
impra          1.22
apprehen       1.22
🤣🤣           1.21
inev           1.17
shenan         1.16
increa         1.16
Activations Density 0.695%