INDEX
Explanations
the occurrence of the word "lu."
Neuron Alignment
Index   Value   % of L₁
162     +0.14   0.8%
434     +0.13   0.7%
376     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
258     +0.14      0.02
56      +0.13      0.01
203     +0.12      0.01
Negative Logits
hift         -1.72
true         -1.58
hin          -1.54
smooth       -1.47
noisy        -1.45
'?"          -1.45
surjective   -1.44
descend      -1.43
complain     -1.40
taient       -1.39
Positive Logits
ĻĤ       2.40
©        2.25
ī        2.24
ª        2.09
²        2.01
Ľ        1.95
bourne   1.90
®        1.84
ford     1.76
¯        1.75
Activations Density 3.608%