INDEX
Explanations
commands or steps related to technical instructions
Neuron Alignment
Index   Value   % of L₁
1823    +0.08   0.2%
1013    +0.08   0.2%
284     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
284     +0.08      0.07
2006    +0.08      0.06
2044    +0.08      0.07
Negative Logits
incess      -0.92
notori      -0.86
sensibili   -0.86
liev        -0.85
attes       -0.80
solidar     -0.80
alkoh       -0.77
Chá         -0.76
mait        -0.75
dè          -0.75
Positive Logits
onto        1.32
into        1.17
into        0.89
vào         0.81
Into        0.79
INTO        0.75
Into        0.74
onto        0.72
unto        0.64
alongside   0.63
Activations Density 0.568%