INDEX
Explanations
technical terms and jargon related to technology or security
Neuron Alignment
Index    Value    % of L₁
1978     +0.16    0.5%
381      +0.13    0.4%
752      +0.13    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
528      +0.16       0.06
871      +0.13       0.06
1990     +0.13       0.07
Negative Logits
,       -0.81
in      -0.80
at      -0.79
for     -0.79
as      -0.77
and     -0.77
with    -0.77
to      -0.76
that    -0.75
her     -0.75
Positive Logits
alkoh        2.20
silikon      2.11
karton       2.01
allarg       1.97
makro        1.97
kosme        1.96
kön          1.96
keramik      1.95
utop         1.92
minimalis    1.91
Activations Density: 0.483%