INDEX
Explanations
details about individual journeys or personal experiences
Neuron Alignment
Index    Value    % of L₁
1253     +0.10    0.3%
1013     +0.09    0.3%
1978     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
416      +0.10       0.05
1166     +0.09       0.07
295      +0.09       0.01
Negative Logits
increa      -1.87
effe        -1.81
encomp      -1.76
guarante    -1.73
fortn       -1.73
maneu       -1.72
affor       -1.71
emphat      -1.71
Juf         -1.71
erad        -1.70
Positive Logits
career        0.97
<bos>         0.89
retirement    0.78
career        0.77
Career        0.76
careers       0.75
pursue        0.74
quit          0.72
Career        0.70
карьер        0.68
Activations Density: 0.751%