INDEX
    Explanations

    references to gender and sexual identity

    Negative Logits
    "(ib"         -0.16
    " shoulder"   -0.15
    "aru"         -0.15
    " tooth"      -0.15
    "корист"      -0.15
    "牙"          -0.15
    "vit"         -0.14
    "eting"       -0.14
    "jaw"         -0.14
    " Verb"       -0.14
    Positive Logits
    " fores"      0.27
    " pub"        0.27
    " vag"        0.27
    " ure"        0.26
    " vul"        0.24
    " pen"        0.24
    " penis"      0.23
    " erect"      0.21
    " hym"        0.21
    "Pen"         0.21
    Activation Density: 0.037%

    No Known Activations