INDEX
Explanations
conjunctions and their frequency in sentences
Neuron Alignment
Index   Value   % of L₁
494     +0.14   0.8%
478     +0.13   0.7%
74      +0.12   0.7%
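The page does not define the "% of L₁" column. One reading that matches the three rows is that each value is a component of the feature's direction over neurons, and the percentage is that component's magnitude divided by the L₁ norm (sum of absolute values) of the whole direction. A minimal sketch under that assumption follows; the function name `neuron_alignment` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def neuron_alignment(direction, top_k=3):
    """Return (index, value, share_of_l1) for the top_k largest-magnitude components.

    Assumption: share_of_l1 = |value| / sum(|all components|).
    """
    l1 = sum(abs(v) for v in direction)
    if l1 == 0:
        return []
    # Rank component indices by absolute value, largest first.
    ranked = sorted(range(len(direction)),
                    key=lambda i: abs(direction[i]),
                    reverse=True)
    return [(i, direction[i], abs(direction[i]) / l1) for i in ranked[:top_k]]


# Toy direction (hypothetical values, not the feature shown on this page).
toy = [0.5, -0.25, 0.0, 0.25]
rows = neuron_alignment(toy, top_k=2)
for index, value, share in rows:
    print(f"{index:<7} {value:+.2f}   {share:.1%}")
```

Under this reading, a value of +0.14 at 0.8% would imply an L₁ norm of roughly 17.5 for the full direction, which is also consistent with +0.13 and +0.12 both rounding to 0.7%.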
Correlated Neurons
Index   P. Corr.   Cos Sim.
494     +0.14      0.08
50      +0.13      0.08
446     +0.12      0.05
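"P. Corr." and "Cos Sim." presumably stand for Pearson correlation and cosine similarity. The page does not say which vectors are compared, so the sketch below only assumes two equal-length lists of numbers. It illustrates why the two columns can differ: Pearson correlation is the cosine similarity of the mean-centered vectors. The function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as cosine similarity after mean-centering."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Toy vectors (hypothetical): always positive, but moving in opposite directions.
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]
print(f"cosine:  {cosine_similarity(a, b):+.3f}")
print(f"pearson: {pearson_correlation(a, b):+.3f}")
```

Because centering removes each vector's mean, two vectors with a large shared offset can have a high cosine similarity and still a low or negative Pearson correlation, as in the toy example.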
Negative Logits
winning   -1.71
mers      -1.66
iful      -1.61
ulsion    -1.58
blogger   -1.53
former    -1.51
borne     -1.48
ching     -1.48
ugu       -1.47
chin      -1.46
Positive Logits
rogens     1.94
others     1.59
/−         1.58
gt         1.45
/+         1.43
its        1.38
Exercise   1.32
Wine       1.32
Markets    1.31
Hours      1.28
Activations Density 0.566%