INDEX
    Explanations

    references to prominent female figures, specifically actresses

    Negative Logits
     str            -0.19
     du             -0.16
    utton           -0.16
    etri            -0.16
     ly             -0.15
    jes             -0.15
    eri             -0.15
     Madden         -0.15
    ugas            -0.15
     fraction       -0.15
    Positive Logits
     htmlentities   0.16
    DM              0.16
     secret         0.15
    _banner         0.15
    topl            0.15
    ë´ī             0.15
    -banner         0.15
    leigh           0.15
    emax            0.14
    izia            0.14
    Activation Density: 0.034%

    No Known Activations