INDEX
Explanations
instances of phrases related to personal growth and self-reflection
Neuron Alignment
Index    Value    % of L₁
1978     +0.17    0.5%
1013     +0.13    0.4%
1842     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
284      +0.17       0.09
2044     +0.13       0.09
1013     +0.09       0.08
Negative Logits
vogli      -1.17
poichè     -1.16
sappi      -1.14
jorge      -1.07
Cfr        -1.07
javier     -1.07
ricardo    -1.06
sergio     -1.06
inol       -1.05
felipe     -1.05
Positive Logits
.          0.68
,          0.63
stuff      0.62
transQ     0.60
and        0.60
because    0.59
really     0.59
of         0.59
,"         0.59
when       0.59
Activations Density: 0.759%