INDEX
Explanations
mentions or discussions related to academic or technical content, possibly with a focus on specific concepts or methodologies
Neuron Alignment
Index   Value   % of L₁
1013    +0.12   0.3%
1870    +0.10   0.3%
1639    +0.09   0.3%
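The page does not say how "% of L₁" is derived. One plausible reading, stated here as an assumption, is that "Value" is a single component of the feature's direction and "% of L₁" is that component's share of the direction's L1 norm (the sum of absolute values). A minimal sketch under that assumption, using a made-up vector and a hypothetical helper name `l1_share`:

```python
def l1_share(weights, index):
    """Fraction of the vector's L1 norm contributed by one component.

    Assumption: this mirrors a "% of L1" column; the dashboard's actual
    formula is not given in the source.
    """
    l1_norm = sum(abs(w) for w in weights)
    return abs(weights[index]) / l1_norm


# Illustrative values only, not data from this page.
direction = [0.12, -0.10, 0.09, 0.05, -0.04]

for i in range(3):
    print(f"{i}  {direction[i]:+.2f}  {l1_share(direction, i):.1%}")
```

With only five components the shares come out large; a share as small as 0.3% for the top component would be consistent with a weight vector spread thinly over many neurons.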
Correlated Neurons
Index   P. Corr.   Cos Sim.
284     +0.12      0.07
1639    +0.10      0.04
885     +0.09      0.05
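As a reading aid for the two columns: by their standard definitions, Pearson correlation is the cosine similarity of mean-centered vectors, so the two numbers can differ whenever the inputs have a nonzero mean. The source does not state which vectors each column was computed on, so the sketch below only illustrates the definitions, with made-up inputs and illustrative function names:

```python
import math


def cosine_similarity(x, y):
    """Dot product divided by the product of Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Cosine similarity after subtracting each vector's mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)


# Illustrative values only, not data from this page.
feature_acts = [0.0, 0.5, 0.0, 1.5, 0.2]
neuron_acts = [1.0, 1.2, 0.9, 1.8, 1.1]

print(round(cosine_similarity(feature_acts, neuron_acts), 3))
print(round(pearson_correlation(feature_acts, neuron_acts), 3))
```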
Negative Logits
Token     Logit
sappi     -1.28
dises     -1.25
vogli     -1.20
mef       -1.18
solidar   -1.17
abbra     -1.14
abr       -1.14
„,        -1.11
gius      -1.10
ordina    -1.09
Positive Logits
Token         Logit
by            1.03
into          0.84
with          0.74
by            0.69
according     0.68
to            0.67
through       0.67
przez         0.66
extensively   0.65
against       0.64
Activations Density 0.374%