INDEX
Explanations
This neuron fires on mentions of the Python programming language (the token “python”).
Negative Logits
handwriting      -0.06
شاه              -0.06
äter             -0.06
direct           -0.06
decay            -0.06
082              -0.05
{%               -0.05
جاد              -0.05
νό               -0.05
abe              -0.05
Positive Logits
delve            0.07
Sho              0.06
unloaded         0.06
.DropDownStyle   0.06
StyleSheet       0.06
depending        0.06
leigh            0.06
.openapi         0.06
.comment         0.06
Posted           0.06
Activations Density 0.033%
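
To make the density figure concrete, here is a minimal sketch of how such a value could be computed. It assumes that density means the fraction of sampled token positions on which the neuron's activation is above zero; the page does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the neuron
    fires at all. The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 33 of which activate.
sample = [0.0] * 99_967 + [1.5] * 33
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.033% corresponds to roughly one activating token in every 3,000 sampled.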