INDEX
Explanations
direct quotes starting with the word "You"
Neuron Alignment
Index    Value    % of L₁
381      +0.15    0.5%
1510     +0.13    0.4%
1919     +0.11    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1919     +0.15       0.08
30       +0.13       0.06
545      +0.11       0.05
Negative Logits
Token        Logit
Huhu         -0.86
pagkak       -0.67
meras        -0.64
poliester    -0.62
madeus       -0.61
vettore      -0.61
capulco      -0.61
lemp         -0.60
specchio     -0.59
Terraria     -0.59
Positive Logits
Token          Logit
You            0.71
You            0.70
mustn          0.68
needn          0.67
you            0.66
yourself       0.64
YOU            0.63
presupposes    0.62
shouldn        0.62
Yourself       0.61
Activations Density 0.209%