INDEX
Explanations
the word "such" in various contexts
Neuron Alignment
Index   Value   % of L₁
50      +0.27   1.4%
1828    +0.16   0.8%
131     +0.14   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
131     +0.27      0.06
1828    +0.16      0.06
755     +0.14      0.05
Negative Logits
<bos>           -2.81
get             -0.72
/***            -0.69
John            -0.69
get             -0.68
code            -0.68
//              -0.67
left            -0.67
contentLoaded   -0.66
India           -0.66
Positive Logits
ecru      2.00
affor     1.97
maneu     1.96
wikihow   1.91
jorge     1.87
increa    1.86
hairc     1.85
milano    1.84
impra     1.83
roberto   1.83
Activations Density: 0.045%