Explanations
descriptive words related to creativity and artistry
Neuron Alignment
Index   Value   % of L₁
50      +0.09   0.4%
220     +0.06   0.3%
68      +0.06   0.2%
Correlated Neurons
Index   Pearson Corr.   Cos Sim.
415     +0.09           0.04
798     +0.06           0.03
220     +0.06           0.03
Negative Logits
Token          Logit
<bos>          -1.38
ⓧ              -0.78
/**            -0.77
,              -0.76
also           -0.74
<?             -0.73
(blank)        -0.73
displayquote   -0.73
itemize        -0.72
</tbody>       -0.71
Positive Logits
Token          Logit
increa         2.26
maneu          2.23
affor          2.16
impra          2.10
accla          1.97
strick         1.96
disagre        1.96
fta            1.95
stockholm      1.95
ftu            1.94
Activations Density 0.167%