INDEX
Explanations
verbs related to completing tasks or achieving goals
Neuron Alignment
Index   Value   % of L₁
1013    +0.14   0.4%
2034    +0.13   0.4%
605     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
284     +0.14      0.11
1013    +0.13      0.09
2044    +0.09      0.08
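
The P. Corr. and Cos Sim. columns above report two related but distinct measures. The page does not say which vectors are compared or how the values are computed, so the following is only a minimal sketch of the standard definitions, assuming two equal-length lists of activations. The function names and sample values are hypothetical and are not taken from the dashboard.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)

# Hypothetical activations over four tokens (illustrative values only).
feature_acts = [0.0, 0.0, 1.0, 2.0]
neuron_acts = [0.5, 0.4, 0.9, 1.6]
print(pearson_correlation(feature_acts, neuron_acts))
print(cosine_similarity(feature_acts, neuron_acts))
```

Because Pearson correlation subtracts each vector's mean first, the two measures can differ noticeably when activations have a large constant offset.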
Negative Logits
Token      Logit
sappi      -1.50
?...       -1.43
increa     -1.41
accla      -1.37
!...       -1.35
encomp     -1.33
emphat     -1.33
guarante   -1.32
purcha     -1.29
unlaw      -1.28
Positive Logits
Token       Logit
.           0.71
SPECIFIED   0.70
;           0.65
!           0.63
with        0.63
regarding   0.62
while       0.61
,           0.61
UpInside    0.61
WithMany    0.61
Activations Density: 0.662%
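
The page does not define how the density figure is computed. A common reading, assumed here, is the fraction of sampled tokens on which the feature has a non-zero activation. The sketch below illustrates that assumed definition; `activation_density`, its `threshold` parameter, and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

# Hypothetical sample: the feature fires on 1 of 8 tokens.
sample = [0.0, 0.0, 0.0, 0.7, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.662% would mean the feature is active on roughly 1 in 150 tokens.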