Explanations
instances of the string "STRING."
Neuron Alignment
Index   Value   % of L₁
369     +0.14   0.8%
376     +0.14   0.8%
334     +0.13   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
334     +0.14      0.01
232     +0.14      0.01
263     +0.13      0.01
Negative Logits
cludes        -1.62
true          -1.59
href          -1.58
cluding       -1.58
immediate     -1.58
acity         -1.56
sis           -1.54
Commons       -1.46
stitutional   -1.45
gress         -1.42
Positive Logits
ings     1.97
ONG      1.93
ument    1.68
INGS     1.67
\.       1.61
ouin     1.61
asso     1.57
ingly    1.56
UMENT    1.56
\*,      1.55
Activations Density: 0.006%