INDEX
Explanations
sentences discussing the process of making decisions or causing actions to happen
Neuron Alignment
Index   Value   % of L₁
1793    +0.10   0.3%
674     +0.10   0.3%
680     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1793    +0.10      0.06
680     +0.10      0.04
1515    +0.10      0.04
Negative Logits
Recife       -0.59
Giugno       -0.59
Pernambuco   -0.59
gmbh         -0.58
Lombar       -0.57
stoff        -0.55
Verk         -0.55
Curitiba     -0.55
Luglio       -0.54
biograf      -0.53
Positive Logits
Makes   0.94
Makes   0.90
make    0.90
make    0.88
Make    0.88
makes   0.88
MAKE    0.88
MAKES   0.86
makes   0.85
Make    0.83
Activations Density 0.160%