INDEX
Explanations
occurrences of events that are expected to happen
Neuron Alignment
Index    Value    % of L₁
1793     +0.13    0.5%
156      +0.13    0.4%
950      +0.12    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1793     +0.13       0.04
492      +0.13       0.04
156      +0.12       0.04
Negative Logits
emphat          -0.75
vainly          -0.74
endeavouring    -0.73
unwarran        -0.70
indescri        -0.69
gaily           -0.69
Bartholo        -0.69
quitted         -0.69
increa          -0.68
withal          -0.68
Positive Logits
expected        1.12
expected        1.05
Expected        1.03
EXPECT          1.02
expect          1.00
Expected        0.96
expectations    0.88
expectation     0.85
expect          0.85
expects         0.84
Activations Density 0.072%