INDEX
    Explanations

    proper nouns and names associated with specific entities or places

    names and mentions of individuals, particularly those associated with specific stories or events

    Negative Logits
    Token        Logit
     smoker      -0.77
    ========     -0.71
    iaries       -0.68
    hetically    -0.68
    wered        -0.67
    ¥ŀ           -0.67
    indust       -0.67
    oult         -0.66
    avorite      -0.66
    ictions      -0.66
    Positive Logits
    Token        Logit
     Ness        0.92
     Rhodes      0.81
     Loch        0.80
    Afee         0.78
    gow          0.77
    inosaur      0.76
    poke         0.75
     Colossus    0.75
    bones        0.74
     Cro         0.74
    Activation Density 0.023%

    No Known Activations