INDEX
Explanations
instances of the word "it."
Neuron Alignment
Index   Value   % of L₁
371     +0.12   0.6%
253     +0.12   0.6%
238     +0.11   0.6%
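The "% of L₁" column above is not defined on the page. A minimal sketch, assuming it means the absolute value of one entry divided by the L₁ norm (sum of absolute values) of the whole vector; the function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard. Under that assumption, a value of about 0.12 at 0.6% would imply an L₁ norm of roughly 20.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" is |w_i| / sum_j |w_j|. The dashboard does not
    state its definition, so treat this as an illustration only.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return abs(weights[index]) / l1_norm


# Hypothetical toy vector, not the real feature's weights.
toy_weights = [0.5, -0.25, 0.25, 0.0]
share = l1_share(toy_weights, 0)
print(f"{share:.1%}")
```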
Correlated Neurons
Index   P. Corr.   Cos Sim.
352     +0.12      0.22
185     +0.12      0.18
165     +0.11      0.17
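The "P. Corr." and "Cos Sim." columns above presumably refer to Pearson correlation and cosine similarity. The page does not say which vectors are compared, so the following is only a sketch of the two standard formulas on hypothetical toy lists. It shows why the two numbers can differ: Pearson correlation is cosine similarity computed after subtracting each vector's mean.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


# Hypothetical toy data, not values from the dashboard.
rising = [1.0, 2.0, 3.0]
falling = [3.0, 2.0, 1.0]
print(round(pearson_correlation(rising, falling), 3))
print(round(cosine_similarity(rising, falling), 3))
```

On this toy pair the two measures disagree in sign: the centered vectors point in opposite directions, while the raw vectors still share a large positive component.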
Negative Logits
Forward           -1.70
EXPORT            -1.70
conversion        -1.56
vir               -1.45
future            -1.45
hal               -1.43
~/                -1.40
printStackTrace   -1.40
{})               -1.40
compat            -1.38
Positive Logits
ting      1.65
anges     1.57
decree    1.57
opter     1.56
hes       1.55
osaurs    1.53
ties      1.49
dream     1.49
hesis     1.49
thing     1.44
Activations Density: 0.394%