    Explanations

    proper nouns, specifically names of people

    mentions of the name "Alex."

    Negative Logits
    Token              Logit
    "purpose"          -0.70
    "enegger"          -0.70
    "recy"             -0.66
    "final"            -0.65
    " discouraging"    -0.64
    "liness"           -0.64
    "coded"            -0.64
    "%%"               -0.63
    "manship"          -0.63
    "draft"            -0.62

    Positive Logits
    Token              Logit
    "iev"              0.89
    " Anton"           0.85
    "illo"             0.84
    "inia"             0.82
    " Koz"             0.81
    " Alexander"       0.80
    "andra"            0.80
    "iants"            0.79
    "azines"           0.78
    "anian"            0.78

    Activation Density: 0.012%

    No Known Activations