    Explanations

    The neuron selectively activates on the word “perverted,” that is, on markers of sexual perversion.

    Negative Logits
    :aload        -0.06
     fullName     -0.06
     BoxFit       -0.06
    Stars         -0.05
     "'.$         -0.05
     //'          -0.05
     bru          -0.05
    textTheme     -0.05
    .StatusCode   -0.05
    Matcher       -0.05

    Positive Logits
     Prov         0.08
    eur           0.08
    etin          0.07
     perv         0.07
     delivering   0.07
    avity         0.07
     convenience  0.07
    within        0.07
    BILE          0.07
     spécial      0.07
    Act Density 0.005%

    No Known Activations