Explanations
The neuron responds to quoted-speech attribution, especially the word “said” and nearby quotation marks.
Negative Logits
visual  -0.07
친  -0.06
_CUDA  -0.06
IPAddress  -0.06
udy  -0.06
诺  -0.06
spices  -0.06
Argentina  -0.06
touring  -0.06
/script  -0.06
Positive Logits
:".  0.07
ре  0.06
customs  0.06
�  0.06
Derm  0.06
."+  0.06
азвание  0.06
-destruct  0.06
RM  0.06
Negative  0.06
Activations Density 0.051%
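
The page does not define how the density figure is computed. A minimal sketch, assuming it means the fraction of sampled tokens on which the neuron's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and used only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the neuron fires.
    This definition is not stated in the source page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 5 of which activate the neuron.
sample = [0.0] * 9995 + [0.8, 1.2, 0.3, 2.1, 0.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.051% would mean the neuron fires on roughly 5 out of every 10,000 tokens, which fits a neuron tuned to a narrow pattern such as speech attribution.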