    Explanations

    names associated with prominent female figures

    Negative Logits
    thic       -0.19
    edik       -0.16
    ascus      -0.15
    ivre       -0.15
    _usec      -0.15
    orks       -0.15
    ouser      -0.15
    ernaut     -0.14
    Means      -0.14
    yle        -0.14

    Positive Logits
    lio         0.17
    enberg      0.16
    tings       0.16
    dol         0.15
    ots         0.15
    rib         0.15
    gan         0.14
    oration     0.14
    tor         0.14
     sick       0.14

    Act Density 0.055%

    No Known Activations