Explanations
The neuron fires on occurrences of the word “cancel” and its morphological variants, such as “cancelation” and “cancellations.”
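As a rough illustration of the explanation above, the trigger can be approximated with a case-insensitive pattern match. This is only a sketch under assumptions: the names `CANCEL_PATTERN` and `find_cancel_variants` are hypothetical, and a regular expression is a stand-in for the neuron's behavior, not a description of how the neuron computes its activation.

```python
import re

# Hypothetical approximation of the trigger described above (not the neuron itself):
# any run of word characters that begins with "cancel", in any letter case.
# No leading word boundary is required, so "cancel" inside "_cancel" or "(cancel"
# is also matched.
CANCEL_PATTERN = re.compile(r"cancel\w*", re.IGNORECASE)


def find_cancel_variants(text: str) -> list[str]:
    """Return every substring of `text` that starts with 'cancel' (any case)."""
    return CANCEL_PATTERN.findall(text)


if __name__ == "__main__":
    sample = "Cancellations were canceled; call on_cancel(job)."
    print(find_cancel_variants(sample))
```

The pattern deliberately omits a leading word boundary so that identifier-style tokens such as `_cancel` are covered as well as plain words.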
Negative Logits
(obs                -0.07
ROUT                -0.07
,image              -0.06
Xunit               -0.06
Soft                -0.06
�                   -0.06
Wei                 -0.06
Obt                 -0.06
[token not shown]   -0.06
.conditions         -0.06
Positive Logits
cancel      0.13
Cancel      0.12
Cancel      0.11
_cancel     0.11
cancell     0.11
cancel      0.10
cancelled   0.10
canceled    0.10
(cancel     0.10
Canceled    0.09
Activations Density 0.006%