INDEX
Explanations
conjunctions and phrases that express connection or addition
Neuron Alignment
Index   Value   % of L₁
478     +0.19   1.1%
23      +0.12   0.7%
454     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
402     +0.19      0.11
146     +0.12      0.11
494     +0.11      0.09
Negative Logits
stown        -1.88
yourself     -1.81
feit         -1.72
mate         -1.54
ungs         -1.53
yourselves   -1.53
matter       -1.53
himself      -1.52
idium        -1.50
acher        -1.49
Positive Logits
hence          1.92
consequent     1.85
/)             1.74
subsequent     1.72
WHM            1.63
consequently   1.59
eventual       1.59
yes            1.56
occasional     1.55
edes           1.50
Activations Density 1.062%