INDEX
Explanations
references to articles related to biology or bioethics
Neuron Alignment
Index    Value    % of L₁
50       +0.17    1.0%
866      +0.12    0.7%
1296     +0.10    0.6%
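The "% of L₁" column can be read as each neuron's share of the total absolute weight. The sketch below assumes that reading (|value| divided by the sum of absolute values over all neurons); the dashboard does not state its formula. The function name `percent_of_l1` and the toy vector are hypothetical and are not taken from this feature.

```python
def percent_of_l1(weights, index):
    """Return |weights[index]| as a percentage of the L1 norm of `weights`.

    Assumption: "% of L1" means a single entry's absolute value divided by
    the sum of absolute values of all entries. The source does not confirm this.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the percentage is undefined")
    return 100.0 * abs(weights[index]) / l1_norm


# Toy vector for illustration only (not data from this dashboard).
toy_weights = [0.5, -0.25, 0.25]
share = percent_of_l1(toy_weights, 0)  # entry 0 holds half of the total |weight|
```

Under this reading, a value of +0.17 at 1.0% would imply an L₁ norm of roughly 17 for the full vector, which is consistent with the other two rows.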
Correlated Neurons
Index    P. Corr.    Cos Sim.
866      +0.17       0.03
1505     +0.12       0.03
144      +0.10       0.02
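For readers unfamiliar with the two columns, here is a minimal sketch of how a Pearson correlation and a cosine similarity are computed. It assumes "P. Corr." is the Pearson correlation between two activation series and "Cos Sim." is the cosine similarity between two vectors; the dashboard does not say which vectors it compares. The helper names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity of the mean-centered series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])
```

The two measures differ only in the mean-centering step, so they can diverge noticeably, as they do in the table, when the quantities being compared are different objects (for example activations versus weights).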
Negative Logits
<bos>        -2.92
/**          -0.93
/*           -0.82
ⓧ            -0.77
/***         -0.72
ransition    -0.72
MarshalTo    -0.65
/*++         -0.65
<?           -0.64
(blank)      -0.63
Positive Logits
unlaw         1.82
Juf           1.71
unwarran      1.64
perfon        1.62
Augu          1.62
increa        1.61
impractica    1.60
sovere        1.58
ftu           1.54
impra         1.52
Activations Density 0.075%
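As a rough illustration of what a density figure like this measures, the sketch below computes the percentage of token positions on which a feature's activation exceeds a threshold. This definition (active means activation greater than zero) is an assumption; the dashboard does not specify how its density is calculated. The function name and sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of positions whose activation is strictly above `threshold`.

    Assumption: a feature counts as "active" on a token when its activation
    is greater than zero. The source does not define the term.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy activations for illustration only (not data from this dashboard).
toy_density = activation_density([0.0, 0.0, 0.0, 1.5])  # one of four positions fires
```

Under this reading, a density of 0.075% would correspond to the feature firing on about 75 of every 100,000 tokens.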