INDEX
    Explanations

    abbreviations and specific codes

    proper nouns and unique identifiers, particularly those related to place names and entities

    Negative Logits
    Token           Logit
    " Mub"          -0.71
    " eleph"        -0.71
    " MAD"          -0.70
    " gobl"         -0.70
    " Bots"         -0.69
    " pione"        -0.69
    " Ambro"        -0.69
    "ortunately"    -0.68
    " Bil"          -0.67
    " Comet"        -0.67
    Positive Logits
    Token           Logit
    "gra"           0.90
    "arde"          0.79
    "ary"           0.79
    "static"        0.74
    "ARY"           0.74
    "pub"           0.73
    "house"         0.72
    "House"         0.72
    "rie"           0.72
    "say"           0.72
    Activation Density: 0.262%

    No Known Activations