    Explanations

    proper nouns, specifically names like "Pere" and "Maurice"

    mentions of specific individuals' names

    Negative Logits
    Token            Logit
    "imation"        -0.82
    "ophobia"        -0.74
    "ulously"        -0.73
    "imates"         -0.68
    "opsy"           -0.67
    "ulous"          -0.66
    "ablishment"     -0.66
    " shame"         -0.65
    "uments"         -0.65
    "aughter"        -0.65
    Positive Logits
    Token            Logit
    "mallow"         0.87
    "theless"        0.87
    " Pere"          0.87
    "tti"            0.86
    "ãģ¦"            0.85
    "gr"             0.78
    "gone"           0.77
    "ignty"          0.75
    "past"           0.74
    "ments"          0.74
    Activation Density: 0.018%

    No Known Activations