INDEX
    Explanations

    names of individuals, particularly prominent women

    Negative Logits
    Token            Logit
    "uko"            -0.15
    "mux"            -0.15
    "ucer"           -0.14
    "çĵľ"            -0.14
    "arde"           -0.14
    "RuleContext"    -0.14
    "evi"            -0.14
    "stad"           -0.14
    "tuk"            -0.14
    "stin"           -0.14
    Positive Logits
    Token            Logit
    " Ann"           0.26
    "ann"            0.24
    "Ann"            0.23
    " ann"           0.21
    "-An"            0.20
    " Sue"           0.19
    " Lynn"          0.19
    " Anne"          0.18
    "ANN"            0.17
    " ANN"           0.17
    Activation Density: 0.041%

    No Known Activations