INDEX
    Explanations

    the word "Nam" at varying activation levels

    references to specific names, particularly related to individuals and places

    Negative Logits
    Token             Logit
    "Introduced"      -0.81
    " Progressive"    -0.74
    "Interview"       -0.72
    "UID"             -0.67
    " Subtle"         -0.65
    " ECB"            -0.64
    " subtitle"       -0.62
    " sample"         -0.62
    "ACTED"           -0.61
    " Millenn"        -0.61
    Positive Logits
    Token             Logit
    "nam"             1.67
    "borgh"           1.05
    "ukong"           1.03
    "ned"             1.03
    "orously"         1.01
    "emi"             0.95
    "emn"             0.95
    "ovember"         0.93
    "icol"            0.93
    "rish"            0.91
    Act Density 0.008%
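
To put the activation density figure in perspective, here is a minimal sketch of the arithmetic. It assumes "Act Density" is the percentage of sampled tokens on which the feature has a non-zero activation; the page does not state that definition. The function names and the one-million-token sample size are hypothetical, chosen only for illustration.

```python
def density_percent_to_fraction(percent: float) -> float:
    """Convert a density given in percent (e.g. 0.008) to a plain fraction."""
    return percent / 100.0


def expected_active_tokens(percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires.

    Assumption: density is the share of tokens with non-zero activation.
    """
    return density_percent_to_fraction(percent) * n_tokens


if __name__ == "__main__":
    # Hypothetical sample of one million tokens at the reported 0.008% density.
    count = expected_active_tokens(0.008, 1_000_000)
    print(f"about {count:.0f} active tokens per 1,000,000")
```

Under that assumption the feature is very sparse: it would fire on roughly 8 tokens in every 100,000.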

    No Known Activations