Explanations
phrases related to problem-solving or advancements and progress in a project
Neuron Alignment
Index   Value   % of L₁
1150    +0.11   0.3%
1013    +0.11   0.3%
2034    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
284     +0.11      0.07
1183    +0.11      0.04
240     +0.10      0.04
Negative Logits
increa    -1.94
fuf       -1.78
?...      -1.75
fta       -1.74
emphat    -1.72
inev      -1.72
purcha    -1.70
effe      -1.66
NOO       -1.65
fte       -1.65
Positive Logits
.                 0.69
."/               0.66
along             0.65
ElementRef        0.64
together          0.63
via               0.62
by                0.61
ItemBackground    0.61
๙                 0.59
while             0.59
Activations Density 0.291%