INDEX
Explanations
the word "as" used in various contexts
Neuron Alignment
Index  Value  % of L₁
478    +0.15  0.8%
502    +0.13  0.8%
340    +0.13  0.8%
Correlated Neurons
Index  P. Corr.  Cos Sim.
502    +0.15     0.07
340    +0.13     0.04
123    +0.13     0.05
Negative Logits
sburg    -1.57
ctors    -1.56
excerpt  -1.52
s        -1.51
ptions   -1.50
NAME     -1.50
ered     -1.45
acks     -1.42
start    -1.41
aration  -1.41
Positive Logits
ylum       2.10
phalt      1.95
ĥ½         1.80
opposed    1.79
ymp        1.62
)',        1.59
well       1.59
belonging  1.53
ymmetric   1.51
leep       1.50
Activations Density 0.197%