INDEX
Explanations
references to regional contexts or entities
Neuron Alignment
Index   Value   % of L₁
156     +0.29   1.7%
30      +0.17   1.0%
119     +0.13   0.8%
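The "% of L₁" column is not defined on this page. One plausible reading, assumed here, is that it gives each neuron's absolute value as a share of the L1 norm (the sum of absolute values) of the full vector. Under that assumption the three rows above would imply an L1 norm of roughly 16 to 17 (for example, 0.29 / 0.017 ≈ 17.1). A minimal Python sketch of that calculation, using a hypothetical toy vector and a hypothetical helper name `percent_of_l1`, not the real data:

```python
def percent_of_l1(weights):
    """Return each entry's share of the vector's L1 norm, in percent.

    Assumption: "% of L1" means 100 * |w_i| / sum_j |w_j|.
    The dashboard itself does not confirm this definition.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        # An all-zero vector has no meaningful shares.
        return [0.0 for _ in weights]
    return [100.0 * abs(w) / l1 for w in weights]


# Hypothetical toy vector (not taken from the dashboard).
toy = [0.5, -0.25, 0.25, 0.0]
shares = percent_of_l1(toy)
print(shares)
```

Because the shares use absolute values, they always sum to 100% for a non-zero vector, regardless of sign.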
Correlated Neurons
Index   P. Corr.   Cos Sim.
30      +0.29      0.03
503     +0.17      0.02
119     +0.13      0.01
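The table reports two different similarity measures, and they need not agree: Pearson correlation ("P. Corr.") is the cosine similarity of mean-centered vectors, while plain cosine similarity ("Cos Sim.") does not remove the mean. Which vectors the dashboard compares in each column is not stated here, so the sketch below only illustrates the two formulas on hypothetical toy data:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity(
        [x - mean_a for x in a],
        [y - mean_b for y in b],
    )


# Hypothetical toy vectors: b is an exact linear function of a (b = 2a + 1).
a = [1.0, 2.0, 3.0]
b = [3.0, 5.0, 7.0]
p = pearson_correlation(a, b)
c = cosine_similarity(a, b)
print(p, c)
```

In this toy case the Pearson correlation is 1 (up to floating-point error) because the relationship is exactly linear, while the cosine similarity is slightly below 1 because of the constant offset.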
Negative Logits
true       -1.60
&&         -1.54
rigorous   -1.47
attr       -1.45
rav        -1.41
olean      -1.39
grily      -1.38
hav        -1.37
True       -1.36
volatile   -1.36
Positive Logits
ised       2.27
ize        2.23
izing      1.97
ising      1.94
izer       1.86
ized       1.84
izes       1.79
ise        1.74
isation    1.72
isations   1.71
Activations Density 0.057%