INDEX
Explanations
web links and technical information like website URLs and numerical data in a document
Neuron Alignment
Index   Value   % of L₁
50      +0.23   1.1%
2019    +0.12   0.6%
1343    +0.11   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
382     +0.23      0.04
1343    +0.12      0.04
1896    +0.11      0.03
Negative Logits
<bos>       -2.67
ⓧ           -1.24
/**         -1.20
intersper   -1.18
            -1.14
quitted     -0.94
<?          -0.92
disbur      -0.91
forbear     -0.89
endow       -0.88
Positive Logits
Literat      0.80
marea        0.77
ados         0.71
corrom       0.71
psicologia   0.70
Pièces       0.69
Hauteur      0.68
lanterna     0.66
Glej         0.66
Sklici       0.66
Activations Density 0.128%