INDEX
    Explanations

    names of individuals or characters

    proper nouns, specifically names of individuals

    Negative Logits
    Token       Logit
    iculty      -0.79
    antry       -0.79
    istry       -0.74
    hedral      -0.74
    itizen      -0.73
    imates      -0.71
    rants       -0.71
    ropy        -0.70
    fulness     -0.70
    ribute      -0.69
    Positive Logits
    Token       Logit
    thal        0.81
    rities      0.80
    ously       0.73
    ova         0.69
     Rove       0.66
     vow        0.66
    eers        0.65
     neuron     0.64
     Osw        0.63
    ocal        0.61
    Activation Density: 0.104%

    No Known Activations