INDEX
    Explanations

    proper nouns, potentially names of people, in a list

    key names of people in various contexts

    Negative Logits
    "tenance"        -0.87
    "emort"          -0.79
    "iquid"          -0.76
    "inarily"        -0.76
    " citiz"         -0.76
    "ometimes"       -0.74
    " spectrum"      -0.73
    "WER"            -0.72
    "onym"           -0.71
    " subjective"    -0.71
    Positive Logits
    "ken"            0.74
    "pton"           0.74
    " Juda"          0.72
    " Jr"            0.68
    " Bagg"          0.68
    " Cul"           0.67
    " Architects"    0.66
    " Casting"       0.65
    " joins"         0.65
    " Es"            0.64
    Activation Density: 3.090%

    No Known Activations