INDEX
    Explanations

    proper nouns representing individuals

    names of notable individuals along with their achievements or roles

    Negative Logits
    .",           -0.82
    !".           -0.75
    ".[           -0.72
    ".            -0.63
    `.            -0.63
    %.            -0.63
    '."           -0.61
    .""           -0.61
     attRot       -0.61
    .:            -0.60
    Positive Logits
    *)            0.71
     acronym      0.59
    pires         0.58
    )|            0.55
    ?)            0.54
     Lloyd        0.54
    )             0.53
     ?)           0.52
     umbrella     0.51
     )]           0.51
    Activation Density: 1.785%

    No Known Activations