INDEX
    Explanations

    references to women, femininity, and gender-related topics

    Negative Logits
    "REDACTED"          -0.86
    "UFF"               -0.78
    "RAY"               -0.75
    "-+-+"              -0.71
    "REC"               -0.70
    " Flavoring"        -0.70
    "raltar"            -0.70
    "æĸ¹"               -0.69
    "ypes"              -0.69
    "rip"               -0.69

    Positive Logits
    "folk"              1.25
    " empowerment"      1.04
    "opausal"           0.94
    " genital"          0.94
    " menstru"          0.93
    "hood"              0.93
    " breasts"          0.92
    " menstrual"        0.85
    " contraceptive"    0.84
    " reproductive"     0.84

    Activation Density: 0.347%

    No Known Activations