    Explanations

    words related to political figures or entities

    proper nouns associated with notable entities or groups

    Negative Logits
    "stration"      -0.71
    "legram"        -0.68
    "aunder"        -0.67
    "amina"         -0.67
    "hess"          -0.67
    "ancial"        -0.67
    "ript"          -0.67
    "enda"          -0.66
    "vironment"     -0.66
    "chet"          -0.66
    Positive Logits
    " deems"         0.79
    " deem"          0.70
    " Enterprises"   0.68
    " chose"         0.67
    " chooses"       0.66
    " stole"         0.66
    " describes"     0.64
    " assigns"       0.64
    " fans"          0.63
    " refers"        0.63
    Activation Density: 0.186%

    No Known Activations