INDEX
Explanations
instances of the word "obi."
Neuron Alignment
Index    Value    % of L₁
181      +0.12    0.7%
172      +0.12    0.7%
450      +0.11    0.6%
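The "% of L₁" column can be read as each neuron's share of the feature vector's L₁ norm (the sum of absolute values). The following is a minimal sketch under that assumption. The function names and the toy vector are hypothetical and are not taken from the dashboard's own code or data.

```python
def l1_fractions(weights):
    """Return each weight's share of the vector's L1 norm (sum of |w|)."""
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return [0.0 for _ in weights]
    return [abs(w) / l1 for w in weights]


def top_k_aligned(weights, k=3):
    """Return (index, value, fraction of L1) for the k largest-magnitude entries."""
    fracs = l1_fractions(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)[:k]
    return [(i, weights[i], fracs[i]) for i in order]


# Toy vector with made-up numbers (NOT the feature shown above).
toy = [0.5, -0.25, 0.0, 0.25]
rows = top_k_aligned(toy, k=2)
for index, value, frac in rows:
    print(f"{index:>5}  {value:+.2f}  {frac:.1%}")
```

Under this reading, a small percentage for even the top neurons (such as 0.7% here) would suggest the feature direction is spread across many neurons rather than concentrated in a few.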
Correlated Neurons
Index    Pearson Corr.    Cos Sim.
181      +0.12            0.01
172      +0.12            0.01
450      +0.11            0.01
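The two columns above measure different things, which is why they can disagree (+0.12 versus 0.01 here). Pearson correlation subtracts each series' mean before comparing, while cosine similarity compares the raw vectors. The sketch below assumes both are computed over paired activation values. The helper names and toy inputs are hypothetical, and this is not the dashboard's actual implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation: cosine similarity of the mean-centered series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den if den else 0.0


def cosine(x, y):
    """Cosine similarity of the raw (uncentered) series."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den else 0.0


# Toy activations (made-up): the two series never fire together.
feature_acts = [1.0, 0.0]
neuron_acts = [0.0, 1.0]
print(pearson(feature_acts, neuron_acts), cosine(feature_acts, neuron_acts))
```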
Negative Logits
Ļª          -1.91
©           -1.88
hip         -1.57
converse    -1.50
lando       -1.48
ĨĴ          -1.46
č           -1.44
")]         -1.43
latter      -1.43
?"          -1.41
Positive Logits
assic        1.82
fileID       1.73
omorphic     1.63
uscript      1.57
plots        1.56
scripts      1.52
istically    1.50
antry        1.49
elong        1.48
shots        1.48
Activations Density: 0.005%