INDEX
    Explanations

    proper nouns referring to people's names

    variations of the word "ather."

    Negative Logits
    aepernick   -0.70
     unreal     -0.62
    ako         -0.61
    hao         -0.61
    sure        -0.60
    apo         -0.59
     Bravo      -0.59
    lez         -0.59
    elman       -0.59
     Played     -0.58
    Positive Logits
    apy         1.04
    ivities     1.03
    sburg       1.03
    ings        0.86
    s           0.84
    weights     0.82
    aith        0.79
    atform      0.78
    athering    0.78
    illac       0.78
    Activation Density: 0.026%

    No Known Activations