    Explanations

    references to personal traditions and holiday celebrations

    Negative Logits
    Token             Logit
    " Hurt"           -0.15
    "enstein"         -0.14
    "oney"            -0.14
    "zell"            -0.14
    "uzey"            -0.14
    "angi"            -0.14
    " interchange"    -0.14
    " starred"        -0.14
    "ilon"            -0.14
    "ýn"              -0.14
    Positive Logits
    Token             Logit
    " popcorn"        0.18
    "è§Ĥçľĭ"          0.18
    " watching"       0.17
    " watch"          0.17
    " Watches"        0.16
    " watches"        0.16
    " watcher"        0.16
    "-watch"          0.16
    ".watch"          0.16
    " Watching"       0.16
    Activation Density: 0.180%

    No Known Activations