Explanations
HTML tags and related markup syntax
Neuron Alignment
Index    Value    % of L₁
57       +0.13    0.7%
320      +0.13    0.7%
114      +0.12    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
345      +0.13       0.02
74       +0.13       0.02
445      +0.12       0.01
Negative Logits
Token             Logit
Ł                 -2.49
ĻĤ                -2.43
§                 -2.41
(not rendered)    -2.39
↵                 -2.39
↵                 -2.39
<|outofrange|>    -2.39
<|outofrange|>    -2.39
(not rendered)    -2.39
(not rendered)    -2.39
Positive Logits
Token        Logit
ATIONAL      1.56
oter         1.53
nd           1.53
keley        1.51
parency      1.48
istration    1.48
ijing        1.47
bert         1.44
etable       1.43
ressional    1.40
Activations Density 0.100%