Explanations
conjunctions and alternative phrases indicating choice or options
Neuron Alignment
Index   Value   % of L₁
436     +0.14   0.8%
350     +0.13   0.7%
261     +0.12   0.7%
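The "% of L₁" column is not defined on this page. One plausible reading, assumed here rather than confirmed, is each neuron's absolute value divided by the L₁ norm (sum of absolute values) of the whole direction. Under that reading, a value of 0.14 at 0.8% would be consistent with an L₁ norm somewhere near 17 or 18. A minimal sketch of that calculation follows; the function name `l1_share` and the 512-entry synthetic vector are hypothetical and are not taken from the dashboard.

```python
import random


def l1_share(weights, top_k=3):
    """Return (index, value, share) for the top_k entries by absolute value.

    share = |value| / sum(|w| for w in weights), i.e. the fraction of the
    vector's L1 norm carried by that entry (assumed meaning of "% of L1").
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    # Sort indices by descending absolute value and keep the first top_k.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in order[:top_k]]


# Synthetic stand-in for a feature direction; NOT the dashboard's data.
random.seed(0)
w = [random.gauss(0.0, 0.04) for _ in range(512)]

rows = l1_share(w)
for idx, val, share in rows:
    print(f"{idx:<7} {val:+.2f}   {share:.1%}")
```

Shares computed this way sum to 1 over all entries, so small percentages for the top neurons would indicate a direction spread across many neurons rather than aligned with a few.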
Correlated Neurons
Index   P. Corr.   Cos Sim.
50      +0.14      0.05
494     +0.13      0.04
347     +0.12      0.04
Negative Logits
ump        -1.56
watson     -1.55
xspace     -1.49
opts       -1.47
MD         -1.32
aryng      -1.32
unge       -1.31
arynge     -1.29
verted     -1.29
ursuant    -1.28
Positive Logits
naments          1.71
chard            1.70
contemplate      1.42
rue              1.39
equivalently     1.35
¿                1.34
vier             1.30
least            1.27
Trustee          1.27
collaborators    1.26
Activations Density: 0.240%