INDEX
Explanations
instances of the word "it."
Neuron Alignment
Index   Value   % of L₁
156     +0.13   0.7%
190     +0.12   0.7%
383     +0.11   0.6%
Correlated Neurons
Index   Pearson Corr.   Cos. Sim.
352     +0.13           0.22
185     +0.12           0.18
165     +0.11           0.17
Negative Logits
ÃŃs       -2.25
ĵ         -1.90
ķ         -1.81
resents   -1.65
edes      -1.61
ĺ         -1.60
¥         -1.59
Ħ         -1.57
Ģ         -1.57
ĭ         -1.56
Positive Logits
place     1.71
ncbi      1.64
imize     1.47
icken     1.34
ubottu    1.31
strand    1.30
amate     1.27
amo       1.27
lane      1.26
blogger   1.26
Activations Density: 0.398%