INDEX
Explanations
references to entities or compounds in various contexts
Neuron Alignment
Index   Value   % of L₁
528     +0.15   0.9%
1145    +0.15   0.9%
1335    +0.13   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1343    +0.15      0.09
227     +0.15      0.08
1145    +0.13      0.05
Negative Logits
<bos>          -2.20
opzioni        -0.99
applicazioni   -0.93
jette          -0.90
mantenga       -0.89
versioni       -0.84
encuentre      -0.83
considère      -0.81
reconnaît      -0.81
conseguenze    -0.78
Positive Logits
accla      1.55
uncin      1.45
maneu      1.44
unce       1.42
socie      1.42
affor      1.42
conspic    1.41
juven      1.40
reluct     1.36
unve       1.35
Activations Density: 0.655%