INDEX
Explanations
scientific and philosophical terminology related to theories and controversies
Neuron Alignment
Index   Value   % of L₁
872     +0.24   0.8%
2015    +0.13   0.4%
876     +0.13   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
872     +0.24      0.09
1446    +0.13      0.05
1380    +0.13      0.02
Negative Logits
CONCLUS          -0.66
RegressionTest   -0.65
IsContent        -0.65
cshtml           -0.64
***!             -0.63
__':             -0.63
UnusedPrivate    -0.63
Newswire         -0.60
getKeyCode       -0.60
OMITBAD          -0.60
Positive Logits
affor      1.83
increa     1.72
perfet     1.71
guarante   1.69
milf       1.68
scrat      1.68
toledo     1.63
hairc      1.62
wikihow    1.62
snoopy     1.59
Activations Density 0.556%