INDEX
Explanations
information related to a distinguished feature or characteristic, like product descriptions with notable attributes or event details with specific dates and locations
Neuron Alignment
Index    Value    % of L₁
50       +0.19    1.1%
1983     +0.14    0.8%
390      +0.10    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1983     +0.19       0.04
1942     +0.14       0.03
390      +0.10       0.03
Negative Logits
<bos>         -3.20
ⓧ             -0.76
var           -0.76
public        -0.74
forChild      -0.73
const         -0.73
immer         -0.71
realize       -0.71
private       -0.70
Descripció    -0.70
Positive Logits
maneu         2.18
affor         2.17
increa        2.10
reluct        2.07
Juf           2.06
stockholm     2.01
milf          2.01
shenan        1.94
accla         1.93
wien          1.92
Activations Density: 0.069%