INDEX
    Explanations

    This neuron detects references to hormone-based gender transition toward the opposite sex.

    Negative Logits
    ávají       -0.07
     Acer       -0.07
    hek         -0.07
     fue        -0.07
     Poh        -0.06
     Benton     -0.06
     Pew        -0.06
     Natur      -0.06
    олаг        -0.06
     stř        -0.06
    Positive Logits
     Aqu        0.06
     urb        0.06
    ">'.        0.06
    ='')        0.06
    MRI         0.06
    06          0.06
     Πολι       0.06
    isor        0.05
    (ws         0.05
     pleasures  0.05
    Activation Density: 0.268%

    No Known Activations