INDEX
    Explanations

    proper nouns, specifically related to politics, organizations, and individuals

    specific names and titles associated with particular entities or groups

    Negative Logits
    "font"             -0.74
    "arious"           -0.68
    "sbm"              -0.66
    "aturday"          -0.63
    "BSD"              -0.62
    "olulu"            -0.62
    "!:"               -0.62
    " Guinness"        -0.60
    "NH"               -0.60
    ";;;;"             -0.60

    Positive Logits
    " cannot"           0.91
    " hadn"             0.88
    " forgot"           0.85
    " withdrew"         0.85
    " transitioned"     0.84
    " itself"           0.84
    " could"            0.83
    " had"              0.83
    " succeeded"        0.82
    " would"            0.82

    Activation Density: 0.637%

    No Known Activations