Explanations
mathematical and technical terms including numbers and symbols
Neuron Alignment
Index   Value   % of L₁
394     +0.27   1.1%
50      +0.26   1.0%
964     +0.14   0.6%
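The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that each neuron's value is expressed as a share of the L₁ norm (sum of absolute values) of the feature's full weight vector. The sketch below illustrates that reading with a hypothetical helper, `neuron_alignment`, and toy weights; it is not taken from the tool that produced the table.

```python
def neuron_alignment(weights, top_k=3):
    """Rank neurons by absolute weight and report each one's share of the L1 norm.

    Assumption: "% of L1" = 100 * |w_i| / sum_j |w_j|. The source page does
    not confirm this definition.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        return []
    # Indices sorted by descending absolute weight (stable for ties).
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], 100.0 * abs(weights[i]) / l1_norm) for i in ranked[:top_k]]


# Toy weights, purely illustrative (not the feature shown above).
toy_weights = [0.5, -0.25, 0.25]
for index, value, percent in neuron_alignment(toy_weights):
    print(f"{index}\t{value:+.2f}\t{percent:.1f}%")
```

Under this reading, a neuron with value +0.27 at 1.1% would imply an L₁ norm of roughly 25 for the full vector, which is at least consistent across the three rows given the rounding.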
Correlated Neurons
Index   P. Corr.   Cos Sim.
394     +0.27      0.20
964     +0.26      0.14
50      +0.14      0.30
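The two columns above can rank neurons differently, which is expected if "P. Corr." abbreviates Pearson correlation (an assumption; the page does not expand it). Pearson correlation is cosine similarity computed after subtracting each vector's mean, so the two measures agree only when the means are zero. A minimal stdlib sketch, with hypothetical function names and toy vectors:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


# Toy vectors: perfectly anti-correlated, yet the raw cosine is positive
# because both vectors have a large positive mean.
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
print(pearson_correlation(u, v), cosine_similarity(u, v))
```

Which pair of vectors the dashboard compares in each column is not stated here, so the sketch only shows how the two statistics relate to each other.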
Negative Logits
TagMode        -0.97
<bos>          -0.96
IsContent      -0.85
""],           -0.82
Drapeau        -0.81
NKC            -0.80
DoubleQuotes   -0.79
Wikispecies    -0.79
FBref          -0.78
typelib        -0.77
Positive Logits
impra      1.60
maneu      1.45
reluct     1.44
affor      1.44
unden      1.42
inev       1.42
uninten    1.39
increa     1.38
Rine       1.37
shenan     1.36
Activations Density: 12.898%
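The density figure is reported without a definition. A common reading, taken here as an assumption, is the percentage of sampled tokens on which the feature's activation is above zero. The helper below, `activation_density`, is hypothetical and uses toy data:

```python
def activation_density(activations, threshold=0.0):
    """Percentage of activations strictly above `threshold`.

    Assumption: "density" counts a token as active when its activation
    exceeds zero. The source page does not state the threshold it uses.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy activations over four tokens, two of them active.
print(f"{activation_density([0.0, 0.0, 0.5, 1.2]):.3f}%")
```

A value near 13% would mean the feature fires on roughly one token in eight under this definition.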