INDEX
Explanations
numbers with words describing quantities or quantifiers
Neuron Alignment
Index    Value    % of L₁
1967     +0.15    0.5%
50       +0.14    0.5%
2019     +0.14    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
369      +0.15       0.03
177      +0.14       0.03
16       +0.14       0.04
Negative Logits
unspeak        -0.72
<bos>          -0.69
relenting      -0.69
ambiguation    -0.67
juges          -0.63
withal         -0.63
indescri       -0.59
pettico        -0.59
nachher        -0.59
nobly          -0.59
Positive Logits
kram           0.93
makro          0.92
plak           0.90
logis          0.90
ideolog        0.90
kac            0.88
praktik        0.87
zub            0.87
akut           0.85
ohr            0.85
Activations Density 0.170%