INDEX
Explanations
phrases related to comparison or contrast
Neuron Alignment
Index   Value   % of L₁
1334    +0.12   0.4%
605     +0.11   0.4%
889     +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1805    +0.12      0.03
356     +0.11      0.02
938     +0.11      0.03
Negative Logits
trouva          -0.53
Violon          -0.51
Daarom          -0.50
Parametric      -0.47
Exponent        -0.47
Även            -0.46
DriverManager   -0.46
PLAUSE          -0.45
Bounded         -0.45
exemplaires     -0.45
Positive Logits
sento       0.75
aspetta     0.74
fortn       0.65
resorting   0.62
trovo       0.62
rendono     0.62
accla       0.61
encomp      0.61
volunte     0.60
kyou        0.60
Activations Density 0.077%