INDEX
Explanations
This neuron seems to activate on several different kinds of text, including citations starting with "L.", mathematical or scientific notation containing expressions, and place names containing "park". I am unable to create a concise description of the neuron's function.
Coding/legal/math text
Negative Logits
[*]             -0.56
censiti         -0.52
DeleteBehavior  -0.51
snippetHide     -0.50
копия           -0.50
Wikimédia       -0.48
Nema            -0.48
protoimpl       -0.48
propTypes       -0.48
errHandler      -0.48
Positive Logits
<bos>        1.97
section      0.67
</table>     0.54
//           0.52
referenties  0.50
مواليد       0.50
Today        0.49
'            0.48
vastaan      0.47
Geplaatst    0.47
Activations Density 0.356%
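
The density figure above can be read as the share of sampled tokens on which the neuron fires. The sketch below illustrates that reading only. It assumes that "activation density" means the fraction of tokens with activation above zero, which this page does not confirm. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" counts a token as active when its activation
    is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (not from the page): 1000 token activations, 4 of them nonzero.
sample = [0.0] * 996 + [0.8, 1.2, 0.3, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.356% would correspond to the neuron being active on roughly 36 of every 10,000 sampled tokens.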