INDEX
    Explanations

    proper nouns, potentially from news articles or other text consisting of a mixture of letters and numbers

    specific acronyms or abbreviations often related to organizations or government entities

    Negative Logits
    Token    Logit
    hol      -0.91
    fe       -0.82
    iors     -0.81
    itant    -0.74
    aign     -0.74
    omore    -0.74
    ite      -0.74
    gard     -0.73
    hor      -0.73
    auga     -0.73

    Positive Logits
    Token    Logit
    IRO      1.66
     IMAGES  1.30
    IRED     1.29
    ION      1.28
    ITAL     1.27
    ORN      1.25
    ELY      1.20
    ECT      1.19
    ATOR     1.19
    IS       1.18
    Activation Density: 0.026%

    No Known Activations