    Negative Logits
    Token           Logit
    "itars"         -0.82
    "*/("           -0.80
    "cffff"         -0.76
    "sequence"      -0.74
    "ickr"          -0.72
    "asio"          -0.69
    "fty"           -0.69
    "oldemort"      -0.69
    "herer"         -0.68
    "ierrez"        -0.67
    Positive Logits
    Token           Logit
    "lene"          1.11
    " Jane"         0.98
    "anne"          0.96
    "mount"         0.95
    "anna"          0.93
    "Kay"           0.92
    " Beth"         0.91
    "Anne"          0.90
    " Sue"          0.88
    " Louise"       0.88
    Activation Density: 0.017%

    No Known Activations