Explanations
instances where the word "somehow" appears, indicating an expression of uncertainty or unexpectedness
Neuron Alignment
Index    Value    % of L₁
50       +0.21    1.0%
1103     +0.11    0.5%
1994     +0.10    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
645      +0.21       0.02
1103     +0.11       0.02
1994     +0.10       0.02
Negative Logits
<bos>      -2.68
/***       -0.81
ⓧ          -0.69
exp        -0.55
//----     -0.54
March      -0.54
Recur      -0.54
March      -0.53
include    -0.53
<?         -0.53
Positive Logits
signora     1.03
dott        1.02
riva        0.99
mezza       0.99
Almería     0.99
rancho      0.98
Czechos     0.96
bosco       0.95
thuy        0.94
chrysler    0.94
Activations Density 0.041%