INDEX
Explanations
quantities or measurements mentioned in the text
Neuron Alignment
Index    Value    % of L₁
50       +0.17    1.0%
1964     +0.12    0.7%
1839     +0.12    0.7%
Correlated Neurons
Index    Pearson Corr.    Cos Sim.
1839     +0.17            0.03
1964     +0.12            0.03
1565     +0.12            0.03
Negative Logits
Token                   Logit
<bos>                   -3.55
ⓧ                       -1.16
(token text missing)    -1.00
/**                     -0.98
/*                      -0.89
/***                    -0.88
/*++                    -0.81
rehabilitate            -0.78
<?                      -0.78
///**                   -0.74
Positive Logits
Token      Logit
jawa       1.41
lele       1.38
bandung    1.36
maroc      1.31
kaos       1.28
riva       1.27
Minang     1.27
kac        1.25
kase       1.22
thuy       1.22
Activations Density: 0.068%