    Explanations

    mentions of specific place names, particularly those related to Santa and associated locations

    Negative Logits
    Token          Logit
    "ubern"        -0.18
    "geh"          -0.17
    "hetto"        -0.16
    "ihu"          -0.16
    "hin"          -0.16
    "uitka"        -0.16
    "ubl"          -0.15
    "PRINTF"       -0.15
    " Contents"    -0.15
    "ack"          -0.14
    Positive Logits
    Token          Logit
    " Claus"       0.19
    "clare"        0.17
    "gram"         0.16
    "atorium"      0.16
    "com"          0.16
    "angelo"       0.15
    "립"           0.15
    " Rosa"        0.15
    " Barbara"     0.14
    "eced"         0.14
    Activation Density: 0.012%

    No Known Activations