Explanations
verbs related to taking action or making decisions
Neuron Alignment
Index    Value    % of L₁
1805     +0.14    0.5%
1581     +0.12    0.4%
25       +0.11    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1805     +0.14       0.03
1565     +0.12       0.02
25       +0.11       0.03
Negative Logits
<bos>      -0.49
kanal      -0.43
chì        -0.43
cove       -0.42
Tikang     -0.42
ibal       -0.42
Zap        -0.42
Rihanna    -0.41
jagung     -0.41
Lar        -0.41
Positive Logits
acting      0.99
act         0.92
acted       0.91
Acting      0.90
pymysql     0.88
acts        0.86
acting      0.86
chrysler    0.85
ACTS        0.85
Acts        0.84
Activations Density: 0.067%