    Explanations

    The neuron is mainly looking for vulgar and inappropriate terms

    slang terms related to male genitalia and insults

    Negative Logits
    Token               Logit
    " carbohydrate"     -0.78
    "EV"                -0.70
    " carbohydrates"    -0.68
    "ocally"            -0.66
    "EVA"               -0.65
    "occ"               -0.64
    " Ket"              -0.63
    "heny"              -0.63
    " sugars"           -0.62
    "uries"             -0.61
    Positive Logits
    Token               Logit
    " dick"             1.22
    " prick"            0.92
    "holes"             0.90
    " asshole"          0.87
    "abase"             0.86
    "hole"              0.84
    "amn"               0.79
    "gers"              0.79
    "ometer"            0.78
    "yright"            0.77
    Activation Density: 0.010%

    No Known Activations