Explanations
words related to strength and growth
Neuron Alignment
Index   Value   % of L₁
1036    +0.09   0.3%
1416    +0.09   0.3%
143     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1265    +0.09      0.04
1416    +0.09      0.03
1036    +0.09      0.01
Negative Logits
ananas    -0.80
myn       -0.80
Keny      -0.79
Juf       -0.78
kokos     -0.76
ricardo   -0.71
karna     -0.70
sergio    -0.70
zara      -0.70
Augu      -0.69
Positive Logits
faster      1.14
easier      1.09
better      1.08
stronger    1.08
healthier   1.08
safer       1.07
harder      1.05
quicker     1.04
happier     1.03
clearer     1.01
Activations Density 0.193%