INDEX
Explanations
references to geographical locations and institutions
Neuron Alignment
Index   Value   % of L₁
419     +0.28   1.7%
12      +0.21   1.3%
161     +0.14   0.9%
Correlated Neurons
Index   P. Corr.   Cos Sim.
419     +0.28      0.13
136     +0.21      -0.06
12      +0.14      0.15
Negative Logits
Token             Value
catalogue         -1.64
cknowled          -1.63
º                 -1.61
Ļª                -1.60
Ĵ                 -1.60
ª                 -1.53
(blank)           -1.50
<|outofrange|>    -1.50
↵                 -1.50
(blank)           -1.50
Positive Logits
Token          Value
,'"            1.73
.'"            1.73
]'             1.72
"}](#          1.49
experienced    1.48
;;             1.46
];             1.43
,'             1.43
necessarily    1.40
}(             1.40
Activations Density 4.655%