Explanations
Dates written as a year, and numbers that appear to relate to military events or history.
Neuron Alignment
Index    Value    % of L₁
50       +0.15    0.8%
1978     +0.11    0.6%
1937     +0.10    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1614     +0.15       0.07
1527     +0.11       0.05
1904     +0.10       0.05
Negative Logits
<bos>        -2.46
ⓧ            -0.68
(blank)      -0.67
<?           -0.65
implement    -0.61
promote      -0.61
put          -0.61
prepare      -0.60
attend       -0.60
expand       -0.60
Positive Logits
lele      1.71
maksi     1.63
seksi     1.61
saar      1.59
pank      1.50
uhr       1.47
parati    1.45
kafe      1.45
maroc     1.45
moza      1.45
Activations Density 0.141%