Explanations
The neuron activates on text that introduces or labels a section of examples or illustrations (e.g. “some examples of…”).
Negative Logits
ps             -0.07
_destination   -0.07
workers        -0.06
robe           -0.06
itle           -0.06
,val           -0.06
enabling       -0.06
_Show          -0.06
.`             -0.06
_additional    -0.06

Positive Logits
®              0.07
nější          0.07
душ            0.06
Networking     0.06
иму            0.06
ابراه          0.06
rası           0.06
náklad         0.06
stmt           0.06
ethernet       0.06
Activations Density 0.017%