INDEX
Explanations
logical conjunctions and alternatives in a sequence
Neuron Alignment
Index   Value   % of L₁
485     +0.12   0.7%
88      +0.12   0.7%
500     +0.11   0.6%
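
The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that it gives each neuron's absolute weight as a share of the L₁ norm (the sum of absolute values) of the whole weight vector. A minimal sketch of that reading, using a hypothetical toy vector rather than the real one:

```python
def l1_share(weights):
    """Return each entry's absolute value as a fraction of the vector's L1 norm.

    Assumption: this mirrors what a "% of L1" column might report; the
    dashboard's actual formula is not given in the source.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(w) / l1 for w in weights]


# Hypothetical toy weights, not taken from the dashboard.
toy_weights = [0.12, -0.05, 0.03]
shares = l1_share(toy_weights)
for w, s in zip(toy_weights, shares):
    print(f"{w:+.2f}  {s:.1%}")
```

Under this reading, a value of +0.12 at 0.7% would imply an L₁ norm of roughly 17 for the full vector, which is consistent with weight being spread thinly over many neurons.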
Correlated Neurons
Index   P. Corr.   Cos Sim.
151     +0.12      0.30
69      +0.12      0.27
434     +0.11      0.27
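
"P. Corr." and "Cos Sim." presumably stand for Pearson correlation and cosine similarity. The page does not say which vectors are compared (activations over a dataset, weight directions, or something else), so the sketch below only illustrates how the two metrics differ on hypothetical toy vectors: Pearson correlation centers each vector on its mean first, while cosine similarity does not.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cosine(x, y):
    """Cosine similarity: dot product divided by the product of Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical toy vectors, not taken from the dashboard.
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
print(f"pearson={pearson(u, v):+.2f}  cosine={cosine(u, v):.2f}")
```

Because the two metrics treat the mean differently, the same pair of vectors can score quite differently on each, which is why a table can list a correlation near 0.12 next to a cosine similarity near 0.30.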
Negative Logits
ĻĤ        -2.64
Ļª        -2.11
"}](#     -2.07
¿½        -1.89
idium     -1.86
icago     -1.70
ĺ         -1.67
©         -1.62
Ģ         -1.59
£         -1.58
Positive Logits
anything     1.53
multiple     1.49
marginally   1.48
longer       1.46
his          1.46
least        1.46
trial        1.42
Django       1.41
either       1.41
excessive    1.40
Activations Density: 3.461%