Explanations
dates expressed in a specific format
Neuron Alignment
Index   Value   % of L₁
50      +0.19   1.3%
478     +0.18   1.3%
1535    +0.14   1.0%
Correlated Neurons
Index   P. Corr.   Cos Sim.
478     +0.19      0.06
1967    +0.18      0.05
1108    +0.14      0.05
Negative Logits
Token                        Logit
<bos>                        -4.06
ⓧ                            -1.32
<?                           -1.29
intersper                    -1.11
/**                          -1.08
[no token text in source]    -1.07
springfox                    -1.07
/***                         -1.02
disbur                       -1.00
gratify                      -0.96
Positive Logits
Token       Logit
seksi       0.71
';          0.65
vasi        0.64
corrom      0.63
{}".        0.62
↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵    0.62
()")        0.61
mikrofon    0.61
marea       0.60
parati      0.60
Activations Density: 0.209%