INDEX
    Explanations

    proper nouns, specifically names of individuals

    Negative Logits
    Token           Logit
    "tml"           -0.70
    "ribes"         -0.64
    "mble"          -0.63
    "aughed"        -0.61
    "aughtered"     -0.61
    "ometimes"      -0.60
    "semble"        -0.58
    " [|"           -0.57
    "sed"           -0.55
    "mits"          -0.55
    Positive Logits
    Token             Logit
    "'s"              1.19
    " joining"        1.10
    " being"          1.09
    " behaving"       1.08
    " becoming"       1.06
    " quitting"       1.02
    " stealing"       1.02
    " disappearing"   1.00
    " marrying"       1.00
    " raping"         1.00
    Activation Density: 0.344%

    No Known Activations