INDEX
Explanations
references to legal cases and judicial citations
Neuron Alignment
Index    Value    % of L₁
327      +0.16    0.9%
307      +0.12    0.7%
492      +0.12    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
307      +0.16       0.04
327      +0.12       0.03
445      +0.12       0.03
Negative Logits
uego        -1.56
ear         -1.53
erce        -1.47
whom        -1.46
eligible    -1.45
ect         -1.45
orie        -1.40
eps         -1.39
ves         -1.36
Writers     -1.36
Positive Logits
©     4.00
¦     3.75
¬     3.65
º     3.61
Ļª    3.58
ĭ     3.56
Į     3.52
«     3.50
ł     3.50
¶     3.48
Activations Density 0.034%