INDEX
    Explanations

    references to motherhood and maternal figures

    Negative Logits
    Token        Logit
    "vae"        -0.19
    "ummings"    -0.17
    "spect"      -0.15
    "omens"      -0.15
    "ataire"     -0.15
    "iteur"      -0.15
    "egal"       -0.14
    "иров"       -0.14
    "slot"       -0.14
    "eec"        -0.14
    Positive Logits
    Token        Logit
    "hood"       0.52
    "ly"         0.38
    "land"       0.35
    "-da"        0.33
    "ing"        0.32
    "less"       0.30
    "boards"     0.28
    " figure"    0.28
    "liness"     0.28
    "-figure"    0.28
    Activation Density: 0.048%

    No Known Activations