INDEX
Explanations
references to historical texts and architectural elements
Neuron Alignment
Index    Value    % of L₁
1871     +0.14    0.4%
872      +0.11    0.3%
509      +0.11    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1188     +0.14       0.02
924      +0.11       0.05
237      +0.11       0.02
Negative Logits
<bos>              -0.75
quehanna           -0.63
CreateTagHelper    -0.61
smör               -0.61
Violon             -0.61
lapto              -0.60
rosion             -0.60
styleType          -0.59
frambo             -0.58
DoubleQuotes       -0.57
Positive Logits
clô             0.90
étend           0.86
renforcé        0.86
librement       0.84
élar            0.83
renfer          0.82
nettement       0.82
quoique         0.82
spécialement    0.82
doté            0.81
Activations Density: 0.475%