INDEX
Explanations
Occurrences of the term "LO", which likely relates to a specific context or concept in the document.
Neuron Alignment
Index   Value   % of L₁
429     +0.15   0.9%
134     +0.11   0.6%
258     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
429     +0.15      0.02
134     +0.11      0.01
394     +0.11      0.01
Negative Logits
Token      Logit
?"         -1.78
fections   -1.69
]>         -1.67
fection    -1.66
↵          -1.59
)](#       -1.59
rceil      -1.59
rbrace     -1.53
...?"      -1.51
"?"        -1.49
Positive Logits
Token      Logit
¥          2.07
ño         1.92
vre        1.88
·          1.82
ģ          1.75
gren       1.75
«          1.74
ŀ          1.71
Ľ          1.69
scape      1.67
Activations Density 0.021%