    Explanations

    Proper nouns, specifically personal names such as "Elizabeth."

    Negative Logits
    unal        -0.78
    awaru       -0.76
    senal       -0.75
    ²¾          -0.74
    ãĥ£         -0.74
    kefeller    -0.71
    ursed       -0.70
    yright      -0.70
    acca        -0.69
    packing     -0.69
    Positive Logits
     Warren       1.03
     Howell       0.92
     Elizabeth    0.88
     Liu          0.86
     Holmes       0.86
     Taylor       0.83
     Olsen        0.78
     Williams     0.77
     Tud          0.76
     Lynn         0.74
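
The two lists above pair each token with a logit value. As a rough sketch, assuming the values are per-token logit contributions of the feature (the page does not state how they were computed), the most negative and most positive entries could be picked out as follows. The function name `top_logits` and the `sample` dictionary are hypothetical; the sample values are copied from the lists above.

```python
def top_logits(logit_by_token, k=10):
    """Return (most_negative, most_positive) lists of (token, logit) pairs."""
    # Sort ascending by logit value, so the most negative entries come first.
    ranked = sorted(logit_by_token.items(), key=lambda item: item[1])
    negative = ranked[:k]          # most negative first
    positive = ranked[::-1][:k]    # most positive first
    return negative, positive


# Hypothetical input: a handful of token/logit pairs taken from the listing.
sample = {
    " Warren": 1.03,
    " Howell": 0.92,
    " Elizabeth": 0.88,
    "unal": -0.78,
    "awaru": -0.76,
    "senal": -0.75,
}

negative, positive = top_logits(sample, k=3)
for token, value in positive + negative:
    print(f"{token!r:14} {value:+.2f}")
```

Tokens are printed with `repr` so that a leading space, as in `' Warren'`, stays visible.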
    Activation Density: 0.009%
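
As a sketch of what a density figure like this could mean, assuming it is the percentage of sampled token positions on which the feature's activation is nonzero (the page itself does not define it), the calculation is a simple ratio. `activation_density` is a hypothetical helper, and the activation list is made-up illustration data rather than values from this feature.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of positions whose activation exceeds `threshold`."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Made-up example: 9 active positions out of 100,000 sampled tokens.
acts = [0.0] * 99_991 + [1.5] * 9
density = activation_density(acts)
print(f"{density:.3f}%")  # prints 0.009%
```

Under that assumption, 0.009% corresponds to roughly 9 active positions per 100,000 tokens.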

    No Known Activations