INDEX
    Explanations

    proper nouns, specifically names of people or places

    Negative Logits
    "ivity"     -0.69
    " Rivals"   -0.69
    "ivities"   -0.66
    "IFIED"     -0.65
    "assets"    -0.64
    "atever"    -0.62
    "atform"    -0.61
    "iffs"      -0.60
    "ITY"       -0.60
    "ively"     -0.60
    Positive Logits
    "leck"      0.81
    "enthal"    0.78
    "erm"       0.76
    "elf"       0.76
    "hal"       0.75
    "eer"       0.75
    "ering"     0.74
    "e"         0.73
    "erman"     0.72
    "pend"      0.72
    Activation Density: 0.044%

    No Known Activations