INDEX
Explanations
The neuron activates on instructional or advisory language—phrases that present steps, tips, or guidance.
Negative Logits
Token       Logit
Sr          -0.07
Daily       -0.06
ReLU        -0.06
ях          -0.06
.getAll     -0.06
.getBody    -0.06
edish       -0.06
activist    -0.05
beds        -0.05
.toastr     -0.05
Positive Logits
Token       Logit
may         0.07
Watt        0.07
_TEMPLATE   0.07
příprav     0.07
Common      0.07
-Co         0.07
enticing    0.06
्च          0.06
might       0.06
emo         0.06
Activations Density 0.037%
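
As a rough illustration of how a density figure like the one above could be computed, here is a minimal sketch. It assumes that "activation density" means the fraction of sampled token positions on which the neuron's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 37 of which fire.
# These numbers are invented to reproduce the reported percentage,
# not taken from the underlying data.
sample = [0.0] * (100_000 - 37) + [1.0] * 37
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.037% would mean the neuron fires on roughly 37 out of every 100,000 tokens, which fits a sparse, narrowly selective neuron.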