INDEX
    Explanations
    Negative Logits
    Token            Logit
    "PDATE"          -0.85
    "ilit"           -0.84
    "querque"        -0.80
    "taboola"        -0.78
    "aution"         -0.78
    "ilitation"      -0.77
    "riter"          -0.76
    " manpower"      -0.75
    "udeb"           -0.75
    "rador"          -0.74
    Positive Logits
    Token            Logit
    " Mae"           1.17
    " Marie"         1.07
    " Doe"           1.07
    " Nicole"        1.06
    " herself"       1.02
    " Lynn"          1.01
    "Anne"           1.00
    " Rae"           0.99
    "Marie"          0.97
    " Jenner"        0.95
    Activation Density: 0.210%

    No Known Activations