INDEX
    Explanations

    names, particularly the name "Pulitzer" with varying degrees of activation

    Negative Logits
    nesota          -0.75
    con             -0.66
    edit            -0.65
    skirts          -0.65
    haps            -0.63
    angered         -0.63
    versions        -0.61
    spring          -0.61
     segreg         -0.61
    termination     -0.59
    Positive Logits
    itzer           1.36
    enegger         0.90
    zman            0.85
    mann            0.82
    sonian          0.80
    rod             0.79
    gerald          0.77
     Bros           0.76
    baum            0.76
     MacArthur      0.75
    Activation Density: 0.004%

    No Known Activations