Explanations
matrices
The neuron activates on fragments of LaTeX-style math expressions (e.g. delimiters such as "$" or "\begin", Greek letters, and matrix syntax).
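As a rough illustration of the fragment types named in this explanation, the sketch below flags math delimiters, environment markers, and a few Greek-letter commands with a regular expression. The pattern, the partial Greek-letter list, and the function name `find_latex_fragments` are assumptions made for illustration; they are not derived from the neuron itself and will not reproduce its actual activations.

```python
import re
from typing import List

# Hypothetical pattern: a crude proxy for the kinds of text described in the
# explanation (delimiters, environments, Greek letters). It is an assumption,
# not a model of the neuron.
LATEX_FRAGMENT = re.compile(
    r"""
      \$                                  # inline math delimiter
    | \\(?:begin|end)\{[a-zA-Z*]+\}       # environment open/close, e.g. \begin{pmatrix}
    | \\(?:alpha|beta|gamma|delta|theta|lambda|mu|sigma|omega)\b  # a few Greek letters
    """,
    re.VERBOSE,
)


def find_latex_fragments(text: str) -> List[str]:
    """Return every LaTeX-like fragment found in `text`, in order of appearance."""
    return [m.group(0) for m in LATEX_FRAGMENT.finditer(text)]


if __name__ == "__main__":
    sample = r"Let $\alpha$ be \begin{pmatrix} a & b \end{pmatrix}"
    print(find_latex_fragments(sample))
```

Matrix syntax is covered here only through environment markers such as `\begin{pmatrix}`; a closer approximation would need a real LaTeX tokenizer.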
Negative Logits
Token           Logit
661             -0.07
orn             -0.07
僕              -0.06
treff           -0.06
spi             -0.06
uomini          -0.06
Neuroscience    -0.06
бла             -0.06
wich            -0.06
§ط              -0.06
Positive Logits
Token           Logit
protože         0.07
ektör           0.06
astery          0.06
렵              0.06
Surface         0.06
airst           0.06
Speed           0.06
=R              0.06
董事            0.06
fats            0.06
Activations Density: 0.001%