INDEX
Explanations
The neuron activates selectively on all-caps C/C++ identifiers and macro names (tokens consisting of uppercase letters and underscores).
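As a rough illustration of the pattern this explanation describes, the sketch below checks whether a token consists only of uppercase ASCII letters and underscores. The helper name `looks_like_macro_name` and the regular expression are assumptions made for illustration; they are not part of the dashboard or of the auto-interp method, and the neuron's real behavior may be broader or narrower than this literal reading.

```python
import re

# Hypothetical pattern: a literal reading of the explanation, i.e. only
# uppercase ASCII letters and underscores, with at least one letter.
_ALL_CAPS = re.compile(r"[A-Z_]*[A-Z][A-Z_]*")


def looks_like_macro_name(token: str) -> bool:
    """Return True if the token (ignoring surrounding whitespace) is all-caps."""
    return _ALL_CAPS.fullmatch(token.strip()) is not None


if __name__ == "__main__":
    # Example tokens; the first two are made-up macro-style names.
    for tok in ["MAX_PATH", " NULL", "NameValuePair", "htmlspecialchars", "___"]:
        print(repr(tok), looks_like_macro_name(tok))
```

Surrounding whitespace is stripped because tokenizer output often includes a leading space; that choice is also an assumption.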
Negative Logits
reward      -0.07
variation   -0.07
Alison      -0.07
eceğiz      -0.06
ệu          -0.06
WORK        -0.06
employers   -0.06
text        -0.06
symbols     -0.06
article     -0.06
Positive Logits
мик               0.06
мой               0.06
ัญช               0.06
않는              0.06
htmlspecialchars  0.06
सह                0.06
NameValuePair     0.06
@d                0.06
-dd               0.06
пох               0.06
Activations Density 0.049%