INDEX
    Explanations

    expressions of surprise or disbelief

    Negative Logits
    Token         Logit
    "folk"        -0.15
    "enger"       -0.15
    "ume"         -0.14
    "rb"          -0.14
    "asons"       -0.14
    "uento"       -0.14
    ".undo"       -0.14
    "alion"       -0.14
    "resenter"    -0.14
    ".dw"         -0.13
    Positive Logits
    Token         Logit
    " snap"       0.24
    " wait"       0.22
    " shoot"      0.22
    " yes"        0.21
    " boy"        0.20
    "hk"          0.20
    " bother"     0.20
    " lord"       0.19
    " g"          0.19
    "snap"        0.18
    Activation Density: 0.017%

    No Known Activations