    Explanations

    instances of the word "surprisingly"

    Negative Logits
    Token           Logit
    " Kurt"         -0.15
    "tha"           -0.15
    "ge"            -0.15
    " Middleton"    -0.14
    "lec"           -0.14
    "op"            -0.14
    "hea"           -0.13
    "etur"          -0.13
    "upert"         -0.13
    " vur"          -0.13
    Positive Logits
    Token           Logit
    "echan"         0.18
    "razil"         0.17
    "æķĪ"           0.17
    "umlu"          0.16
    "çļĦå°ı"        0.15
    "ẩm"            0.15
    "achi"          0.15
    "nia"           0.15
    "šak"           0.15
    "DBG"           0.14
    Act Density 0.003%

    No Known Activations