INDEX
Explanations
phrases related to historical events or discussions of societal issues
Neuron Alignment
Index    Value    % of L₁
674      +0.10    0.3%
1271     +0.10    0.3%
1343     +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1271     +0.10       0.02
597      +0.10       0.01
1539     +0.10       0.01
Negative Logits
pylab      -0.80
skimage    -0.79
heapq      -0.69
pymysql    -0.65
psycopg    -0.60
pymongo    -0.59
sympy      -0.58
pym        -0.55
Simult     -0.54
zipfile    -0.53
Positive Logits
.                  0.79
..                 0.72
.*                 0.61
CascadeType        0.58
Bourgoin           0.56
.,                 0.55
ConstraintMaker    0.53
otheby             0.52
...                0.52
/**                0.52
Activations Density: 0.026%