INDEX
    Explanations

    terms related to awe and wonder

    terms related to gender, specifically emphasizing the concept of "woman."

    Negative Logits
    Token          Logit
    "ebook"        -0.63
    "packing"      -0.61
    " screen"      -0.60
    " ENTER"       -0.60
    " Hampton"     -0.59
    "OUT"          -0.59
    "sheet"        -0.57
    "https"        -0.57
    " pasture"     -0.56
    " Beautiful"   -0.56
    Positive Logits
    Token          Logit
    "omen"         1.26
    "ovic"         0.86
    "stru"         0.85
    "oshenko"      0.84
    "opol"         0.83
    "nant"         0.83
    "oso"          0.83
    "eday"         0.82
    "iak"          0.81
    "nown"         0.81
    Activation Density: 0.005%

    No Known Activations