INDEX
Explanations
phrases related to putting something into action or emphasizing a significant impact
Neuron Alignment
Index   Value   % of L₁
1950    +0.11   0.3%
562     +0.09   0.3%
421     +0.08   0.2%
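The "% of L₁" column can be read as each entry's share of a vector's L₁ norm. The sketch below illustrates that arithmetic under this assumption; the function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm."""
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; share is undefined")
    return abs(weights[index]) / l1_norm


# Toy direction vector (made-up values, not the dashboard's data).
toy_direction = [0.5, -0.25, 0.25, 0.0]
share = l1_share(toy_direction, 0)
print(f"{share:.1%}")  # prints 50.0%
```

Under this reading, small percentages such as those in the table would mean that no single neuron accounts for much of the vector's total absolute weight.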
Correlated Neurons
Index   P. Corr.   Cos Sim.
213     +0.11      0.04
562     +0.09      0.03
1950    +0.08      0.04
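Assuming "P. Corr." abbreviates Pearson correlation and "Cos Sim." cosine similarity, the two columns measure related but different things: Pearson correlation is the cosine similarity of the mean-centered vectors. The sketch below shows the difference on toy data; which vectors the dashboard actually compares is not stated here, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Toy vectors (made-up values): perfectly anti-correlated, yet the raw
# vectors still point in a broadly similar direction.
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
print(round(pearson_correlation(u, v), 3))  # -1.0
print(round(cosine_similarity(u, v), 3))    # 0.714
```

The two measures can disagree substantially, which is why the table reports them in separate columns.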
Negative Logits
Châ        -0.56
catég      -0.55
Mémoires   -0.54
Vaux       -0.54
'';        -0.52
akti       -0.52
Bré        -0.51
%\         -0.51
elek       -0.51
Autre      -0.51
Positive Logits
put        0.84
puts       0.79
PUT        0.76
putting    0.76
putting    0.73
Putting    0.72
put        0.71
PUT        0.71
Put        0.69
Put        0.68
Activations Density 0.145%