INDEX
Explanations
phrases related to changes and plans for the future
Neuron Alignment
Index   Value   % of L₁
663     +0.11   0.4%
130     +0.10   0.3%
755     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
663     +0.11      0.04
911     +0.10      0.04
130     +0.09      0.07
Negative Logits
perfon   -1.08
desir    -1.08
leaft    -1.06
fays     -1.05
reft     -1.05
feen     -1.05
laft     -1.02
secon    -1.02
fup      -1.02
ftre     -1.00
Positive Logits
maraming        0.78
pagkak          0.72
betweenstory    0.67
bagay           0.63
buhay           0.63
autorytatywna   0.62
bawat           0.62
intptr          0.62
sarili          0.62
Wikimédia       0.60
Activations Density 0.825%