    Explanations

    proper nouns that seem to be related to political or social contexts

    prominent people, organizations, and geopolitical references

    Negative Logits
    " Crescent"    -0.50
    "OURCE"        -0.49
    " Curious"     -0.48
    "ULL"          -0.48
    "Comments"     -0.48
    " unden"       -0.48
    "luster"       -0.47
    "++++++++"     -0.47
    ".#"           -0.47
    "+."           -0.46
    Positive Logits
    " didnt"       0.85
    " hadn"        0.84
    " forgot"      0.81
    " could"       0.80
    " doesnt"      0.78
    " had"         0.78
    " cannot"      0.76
    " deserved"    0.75
    " should"      0.75
    " lacks"       0.74
    Act Density 0.585%

    No Known Activations