INDEX
Explanations
proper nouns and specific names, especially related to gaming, sports, and politics
Neuron Alignment
Index   Value   % of L₁
1272    +0.13   0.5%
144     +0.12   0.5%
1565    +0.12   0.4%
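The "% of L₁" column can be read as each neuron's share of the L₁ norm of the feature's weight vector. The sketch below works under that assumption; the function name `top_alignment` and the toy weights are hypothetical and are not taken from the dashboard.

```python
def top_alignment(weights, k=3):
    """Return (index, value, share_of_l1) for the k largest-magnitude weights."""
    # L1 norm: sum of absolute values over all neurons.
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []
    # Rank neuron indices by absolute weight, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in ranked[:k]]


# Toy weight vector for illustration only (not the real feature weights).
rows = top_alignment([0.5, -0.25, 0.125, 0.125], k=2)
for index, value, share in rows:
    print(f"{index:>5}  {value:+.2f}  {share:.1%}")
```

Under this reading, shares of about 0.5% for the top neurons would mean the weight is spread thinly across many neurons rather than concentrated in a few.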
Correlated Neurons
Index   P. Corr.   Cos Sim.
1565    +0.13      0.06
1272    +0.12      0.06
144     +0.12      0.05
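Assuming "P. Corr." abbreviates Pearson correlation and "Cos Sim." cosine similarity, the two columns differ only in whether the vectors are mean-centered first. The sketch below shows both on toy vectors; the helper names and inputs are hypothetical, and which vectors the dashboard actually compares is not stated in the source.

```python
import math


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])


# Toy vectors for illustration only.
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]
cos_ab = cosine(a, b)
corr_ab = pearson(a, b)
print(f"cosine={cos_ab:.3f} pearson={corr_ab:.3f}")
```

The toy pair is positively aligned by cosine similarity yet perfectly anti-correlated, so the two measures need not agree in magnitude or even in sign.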
Negative Logits
Gdy            -0.58
<bos>          -0.47
Przyp          -0.45
borderRadius   -0.43
Jednak         -0.43
Dlaczego       -0.43
Ostat          -0.42
Dlatego        -0.41
Jakie          -0.41
NewUrlParser   -0.40
Positive Logits
Moderato     0.88
marte        0.86
swarovski    0.85
pican        0.85
broderie     0.82
fernando     0.82
Allegretto   0.81
milano       0.80
aquarelle    0.80
lapto        0.80
Activations Density: 0.653%
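One common reading of an activation density is the fraction of sampled tokens on which the feature fires at all. The sketch below computes that fraction under this assumption; the function name, the zero threshold, and the toy activations are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy activations for illustration only: one active token out of four.
density = activation_density([0.0, 0.0, 1.5, 0.0])
print(f"{density:.3%}")
```

On this reading, a density of 0.653% would correspond to the feature being active on roughly 1 in 150 tokens.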