Explanations
mentions of job titles or positions within a company
Neuron Alignment
Index    Value    % of L₁
1967     +0.20    0.6%
1385     +0.13    0.4%
478      +0.10    0.3%
Correlated Neurons
Index    Pearson Corr.    Cos Sim.
650      +0.20            0.04
1048     +0.13            0.03
204      +0.10            0.04
Negative Logits
maneu       -1.74
increa      -1.58
inev        -1.57
effe        -1.56
depic       -1.55
thut        -1.54
fortn       -1.52
encomp      -1.52
»>          -1.50
guarante    -1.49
Positive Logits
own                 0.79
s                   0.71
كومونز              0.68
للاسماء             0.67
存于互联网档案馆      0.65
biggest             0.64
own                 0.61
largest             0.58
official            0.57
ability             0.57
Activations Density 0.171%