    Explanations

    This neuron flags explicit erotic or sexual content, particularly words referring to nudity or semi-nudity.

    Negative Logits
    Token           Logit
     حداقل          -0.08
     İŞ             -0.07
     bic            -0.06
    ']="            -0.06
     میزان          -0.06
    /ws             -0.06
                    -0.06
    言って          -0.06
    +y              -0.06
     ِل              -0.06
    Positive Logits
    Token           Logit
     nude           0.08
     Nude           0.08
     Norse          0.07
     distortion     0.07
     nudity         0.07
     German         0.06
     Holland        0.06
     defense        0.06
     expose         0.06
    StyleSheet      0.06
    Activation Density: 0.004%
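
The page does not define how the density figure is computed. As a rough sketch only, assuming it is the percentage of sampled tokens on which the neuron's activation exceeds zero, a value like 0.004% could arise as follows. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    neuron fires at all. The source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 4 active tokens out of 100,000 sampled tokens.
sample = [0.0] * 99_996 + [0.3, 1.2, 0.7, 0.05]
print(f"{activation_density(sample):.3f}%")  # prints 0.004%
```

Under this assumed definition, 0.004% corresponds to roughly one active token in every 25,000 sampled.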

    No Known Activations