INDEX
    Explanations

    names of individuals

    proper nouns, specifically names of people

    Negative Logits
    cake      -0.65
    OPS       -0.62
     Sussex   -0.61
     Legion   -0.60
     Cure     -0.60
    ually     -0.60
     Blaze    -0.60
    jack      -0.60
    Word      -0.60
    FW        -0.59
    Positive Logits
    quist     1.19
    gren      1.17
    sky       1.05
    kson      0.99
    qv        0.90
    enegger   0.90
    hetti     0.90
    chuk      0.89
    afort     0.89
    ramid     0.88
    Act Density 0.022%

    No Known Activations