INDEX
Explanations
mentions of responsibilities or tasks that need to be completed
Neuron Alignment
Index   Value   % of L₁
1042    +0.07   0.2%
1013    +0.07   0.2%
431     +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
213     +0.07      0.04
1580    +0.07      0.03
1615    +0.07      0.04
Negative Logits
increa      -1.29
unlaw       -1.25
unden       -1.20
disagre     -1.19
impra       -1.18
guarante    -1.18
encomp      -1.18
wherea      -1.13
Intere      -1.13
shenan      -1.13
Positive Logits
busy        0.88
priorities  0.81
schedules   0.71
忙          0.68
hectic      0.67
priority    0.65
schedule    0.65
deadlines   0.64
busiest     0.63
يتيمه       0.62
Activations Density: 0.423%