INDEX
Explanations
instances of the word "examine" in various contexts
Neuron Alignment
Index   Value   % of L₁
156     +0.29   1.7%
376     +0.20   1.2%
115     +0.18   1.1%
Correlated Neurons
Index   P. Corr.   Cos Sim.
95      +0.29      0.01
498     +0.20      0.01
345     +0.18      0.01
Negative Logits
ı          -1.49
itsu       -1.44
umab       -1.43
flourish   -1.38
></        -1.33
ksen       -1.32
ensen      -1.32
quest      -1.31
clud       -1.31
dered      -1.31
Positive Logits
him        1.76
iqu        1.62
es         1.59
them       1.57
them       1.50
how        1.49
yg         1.44
eing       1.43
a          1.40
ldots      1.33
Activations Density 0.556%