    Explanations

    names of people

    names of people and notable figures

    Negative Logits
    " targ"        -0.81
    " Munich"      -0.77
    "redit"        -0.73
    " Dian"        -0.67
    " meg"         -0.66
    " phys"        -0.65
    " synd"        -0.65
    " hots"        -0.65
    "Ds"           -0.64
    " PK"          -0.64

    Positive Logits
    " Vas"          1.85
    " Emily"        1.71
    "Emily"         1.49
    " Coffin"       1.07
    " Vance"        1.04
    " Coff"         1.00
    " Rochester"    0.98
    " Gupta"        0.94
    " Caleb"        0.94
    " Claire"       0.93

    Activation Density: 0.043%

    No Known Activations