INDEX
    Explanations

    names of locations and people

    proper nouns and terms related to specific people and places

    Negative Logits
     rooting      -0.76
     metic        -0.75
     stump        -0.69
     craving      -0.64
     unsett       -0.63
     departing    -0.63
     roundup      -0.63
     heav         -0.62
     mildly       -0.62
    ermott        -0.62
    Positive Logits
    gard          1.22
    ner           1.03
    gart          0.97
    ners          0.92
    alist         0.91
    ente          0.89
    ility         0.88
    lasses        0.88
    pipe          0.88
    ens           0.87
    Act Density 0.025%

    No Known Activations