INDEX
Explanations
numerals representing numbers within a text
Neuron Alignment
Index   Value   % of L₁
50      +0.17   0.9%
2019    +0.07   0.4%
382     +0.06   0.3%
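
The "% of L₁" column can be read as each neuron's share of a weight vector's total absolute weight. The source does not define the column, so the sketch below assumes it is |value| divided by the L₁ norm of the full vector. The function name `percent_of_l1` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def percent_of_l1(weights, index):
    """Return the share of the L1 norm carried by weights[index], in percent.

    Assumed definition: 100 * |w_i| / sum_j |w_j|.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return 100.0 * abs(weights[index]) / l1


# Toy vector for illustration only, not the real feature weights.
toy = [0.5, -0.25, 0.25]
share = percent_of_l1(toy, 0)
```

Under that reading, the three rows above imply an L₁ norm of roughly 17 to 20 (for example 0.17 / 0.009 ≈ 18.9), which is mutually consistent once rounding of the displayed values is taken into account.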
Correlated Neurons
Index   P. Corr.   Cos Sim.
2034    +0.17      0.07
1922    +0.07      0.07
538     +0.06      0.05
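
The two columns above measure different things, which is why they can disagree. The sketch below assumes "P. Corr." is a Pearson correlation between two activation series and "Cos Sim." is a cosine similarity between two vectors. The source does not say which quantities are compared, so the helper names and inputs here are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance of mean-centred series over the product of their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def cosine(u, v):
    """Cosine similarity: dot product over the product of Euclidean norms (no centring)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Pearson correlation subtracts the mean before comparing, while cosine similarity does not, so two vectors with a shared offset can have a high cosine similarity and a low correlation.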
Negative Logits
Token          Logit
<bos>          -1.10
<?             -1.04
ⓧ              -1.03
               -0.95
/**            -0.92
<?             -0.75
/*!            -0.72
/*             -0.70
#![            -0.70
rehabilitate   -0.69
Positive Logits
Token          Logit
lele           1.49
maroc          1.37
bandung        1.36
ananas         1.29
:");           1.28
">/            1.24
))))))))       1.22
">...          1.22
%")            1.21
thuy           1.20
Activations Density: 0.250%