INDEX
Explanations
instances of the word "coh" in various forms
Neuron Alignment
Index    Value    % of L₁
148      +0.14    0.8%
43       +0.13    0.7%
111      +0.12    0.7%
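The "% of L₁" column appears to report each value's magnitude as a share of the L₁ norm (the sum of absolute values) of the whole vector. A minimal sketch of that arithmetic follows, assuming this reading of the column is correct. The function name `l1_share` and the toy weights are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights):
    """Return each entry's |value| as a percentage of the vector's L1 norm."""
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1_norm for w in weights]


# Hypothetical toy vector, NOT the feature's real weights.
toy_weights = [0.5, -0.25, 0.25]
shares = l1_share(toy_weights)
print(shares)
```

Under this reading, small percentages such as those listed above would mean the weight is spread across many neurons rather than concentrated in a few.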
Correlated Neurons
Index    P. Corr.    Cos Sim.
111      +0.14       0.02
410      +0.13       0.01
43       +0.12       0.01
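Assuming "P. Corr." stands for Pearson correlation and "Cos Sim." for cosine similarity, the two columns measure different things: Pearson correlation subtracts each vector's mean before comparing, while cosine similarity does not. The sketch below uses the standard definitions with only the standard library. The function names and sample vectors are illustrative, and the source does not say which vectors the dashboard actually compares.

```python
import math


def cosine_similarity(x, y):
    """Dot product of x and y divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Cosine similarity of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)


# Hypothetical sample vectors, not data from the dashboard.
x = [1.0, 2.0, 3.0]
y = [3.0, 2.0, 1.0]
print(pearson_correlation(x, y), cosine_similarity(x, y))
```

In this toy case the vectors are perfectly anti-correlated yet still have positive cosine similarity, because all entries share the same sign. The two numbers need not agree, so they are best read as separate signals.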
Negative Logits
Token                 Logit
ĨĴ                    -2.82
·¸                    -2.04
į                     -1.98
IJ                    -1.93
Ĥ¬                    -1.91
<|outofrange|>        -1.83
↵                     -1.83
(no visible token)    -1.83
(no visible token)    -1.83
č↵                    -1.83
Positive Logits
Token       Logit
snaps       1.69
stown       1.63
pool        1.63
ousing      1.54
--          1.53
trips       1.49
round       1.40
orate       1.40
eston       1.39
omology     1.39
Activations Density: 0.021%