INDEX
    Explanations

    full names of people

    proper nouns, especially names

    Negative Logits
    Token              Logit
    "selection"        -0.64
    "akedown"          -0.63
    "noon"             -0.60
    "cessive"          -0.59
    "Round"            -0.58
    "geries"           -0.56
    "berra"            -0.55
    "levant"           -0.55
    " intervening"     -0.55
    " lockout"         -0.55
    Positive Logits
    Token              Logit
    " said"             1.14
    " joked"            1.08
    " exclaimed"        1.07
    " says"             1.03
    " explained"        1.03
    " told"             1.02
    " replied"          1.00
    " remarked"         1.00
    " laughed"          0.99
    " wrote"            0.96
    Activation Density: 0.179%

    No Known Activations