INDEX
Explanations
Research papers
This neuron fires on formal scientific or academic terminology and section-introduction words characteristic of technical writing.
Negative Logits
_np          -0.07
_vm          -0.06
Bergen       -0.06
Bulgarian    -0.06
М            -0.06
Київ         -0.06
iena         -0.06
Deep         -0.06
Frames       -0.06
YLE          -0.06
Positive Logits
átek         0.07
/|           0.06
‚ط           0.06
_TEX         0.06
salv         0.06
lement       0.06
واس          0.06
Συ           0.06
MacOS        0.06
ActiveSheet  0.06
Activation Density: 0.065%