INDEX
Explanations
references to economic concepts and discussions
Neuron Alignment
Index    Value    % of L₁
50       +0.28    1.6%
1870     +0.16    0.9%
90       +0.10    0.6%
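The "% of L₁" column is not defined on this page. One plausible reading, assumed here, is that each entry is the absolute value of a neuron's weight divided by the L₁ norm (sum of absolute values) of the full weight vector. The sketch below illustrates that reading on a toy vector; the function name `l1_shares` and the toy values are hypothetical and are not taken from the dashboard.

```python
def l1_shares(weights):
    """Return each weight's absolute value as a percentage of the vector's L1 norm.

    Assumption: this is what a "% of L1" column reports; the page does not say so.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1_norm for w in weights]


# Toy vector (hypothetical values, not the feature's real weights).
toy_weights = [0.5, -0.25, 0.25]
shares = l1_shares(toy_weights)
print(shares)
```

Under the same assumption, the first row of the table (value +0.28 at 1.6%) would imply an L₁ norm of roughly 0.28 / 0.016 ≈ 17.5 for the full vector, which is also roughly consistent with the other two rows after rounding.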
Correlated Neurons
Index    P. Corr.    Cos Sim.
1777     +0.28       0.04
47       +0.16       0.04
90       +0.10       0.03
Negative Logits
<bos>         -2.86
distribute    -0.59
retain        -0.58
set           -0.58
maintain      -0.58
invade        -0.57
ActionMode    -0.57
ver           -0.57
(empty)       -0.57
introduce     -0.57
Positive Logits
Minang       1.45
bandung      1.45
jaya         1.44
jawa         1.39
paradiso     1.38
véhic        1.37
🤣🤣         1.36
carrefour    1.30
Augu         1.28
jati         1.27
Activations Density 0.037%