Explanations
The neuron activates on occurrences of the word “indirect” (e.g. as a section heading or label).
Negative Logits
Kelly       -0.07
chore       -0.07
izioni      -0.06
cram        -0.06
CELER       -0.06
Daddy       -0.06
farms       -0.06
oun         -0.06
caling      -0.06
clans       -0.06
Positive Logits
indirect    0.09
Indicator   0.08
ou          0.07
�           0.07
indirectly  0.07
enclosed    0.07
�           0.07
.toDouble   0.07
اغ          0.07
remote      0.07
Activations Density 0.005%