INDEX
Explanations
quotes or philosophical statements
Neuron Alignment
Index   Value   % of L₁
50      +0.31   1.2%
394     +0.10   0.4%
1253    +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1253    +0.31      0.04
394     +0.10      0.11
1744    +0.08      0.07
Negative Logits
<bos>         -1.81
hline         -0.70
public        -0.70
GroupLayout   -0.69
/**           -0.69
/*            -0.67
<tr>          -0.65
/**           -0.64
//            -0.63
</tbody>      -0.62
Positive Logits
swarovski     1.44
ricardo       1.37
impra         1.37
véhic         1.36
sergio        1.36
affor         1.33
hairc         1.29
embodi        1.29
philanth      1.29
carrefour     1.29
Activations Density 3.131%