Explanations
comparisons using the word "as" followed by a noun phrase
Neuron Alignment
Index    Value    % of L₁
674      +0.13    0.4%
528      +0.11    0.4%
1334     +0.11    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1334     +0.13       0.07
341      +0.11       0.07
528      +0.11       0.07
Negative Logits
hoga         -0.76
karna        -0.73
kade         -0.71
gmbh         -0.68
saha         -0.68
bera         -0.68
labd         -0.68
ervations    -0.66
sirup        -0.65
jawa         -0.65
Positive Logits
as         0.80
As         0.77
as         0.73
Să         0.68
AS         0.68
As         0.68
Și         0.67
Sebagai    0.65
<bos>      0.63
étroite    0.59
Activations Density: 0.266%