INDEX
Explanations
verbs related to work and productivity
Neuron Alignment
Index   Value   % of L₁
50      +0.19   0.9%
1805    +0.08   0.3%
347     +0.08   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
559     +0.19      0.04
1805    +0.08      0.04
36      +0.08      0.04
Negative Logits
<bos>          -2.94
<tfoot>        -0.70
const          -0.67
public         -0.66
displayquote   -0.62
advance        -0.60
便             -0.59
})();          -0.59
座             -0.58
Lugares        -0.57
Positive Logits
impra       1.62
disagre     1.61
reluct      1.58
tolerably   1.58
affor       1.51
swarovski   1.50
fortn       1.49
accla       1.48
unspeak     1.48
shenan      1.47
Activations Density: 0.322%