INDEX
Explanations
URLs and code snippets related to software and technology setups
Neuron Alignment
Index   Value   % of L₁
1177    +0.14   0.4%
1445    +0.13   0.4%
1967    +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1445    +0.14      0.05
1003    +0.13      0.02
203     +0.12      0.03
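The two columns above can disagree for the same neuron, which is easier to see with a small calculation. The following is a minimal sketch, assuming "P. Corr." abbreviates Pearson correlation and "Cos Sim." cosine similarity. The dashboard does not state which vectors are compared, so the function names and input values below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical values, not taken from the dashboard.
feature_values = [1.0, 2.0, 3.0]
neuron_values = [3.0, 2.0, 1.0]

# The vectors move in opposite directions around their means,
# yet both have only positive entries, so the two metrics differ in sign.
print(pearson_correlation(feature_values, neuron_values))
print(cosine_similarity(feature_values, neuron_values))
```

Because Pearson correlation removes the mean first, it measures co-variation, while cosine similarity measures raw direction; a neuron can therefore rank high on one and low on the other.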
Negative Logits
Token         Logit
getLastName   -0.60
«             -0.60
“             -0.56
]:            -0.55
„             -0.54
              -0.52
"             -0.52
「             -0.52
<eos>         -0.52
‘             -0.51
Positive Logits
Token     Logit
saluti    1.15
sappi     1.14
soggior   1.13
Luglio    1.10
quoique   1.09
appunt    1.07
simplif   1.06
Ottobre   1.05
scrat     1.05
parteci   1.03
Activations Density 0.195%
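A density figure like this is commonly read as the fraction of sampled tokens on which the feature has a nonzero activation. The page itself does not define the metric, so the sketch below rests on that assumption; the function name, threshold parameter, and sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above the threshold.

    Assumes density means "share of tokens where the feature fires";
    the dashboard does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for value in activations if value > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 5 of them active.
sample = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.195% would correspond to roughly 2 active tokens per 1000 sampled.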