    Explanations

    the phrase "what's going on."

    NEGATIVE LOGITS
    Token         Logit
    "ritical"     -0.78
    "eah"         -0.76
    "onso"        -0.74
    "ply"         -0.72
    "ĵ"           -0.72
    "yes"         -0.71
    "prise"       -0.69
    "serv"        -0.69
    "oning"       -0.69
    "attery"      -0.69
    POSITIVE LOGITS
    Token           Logit
    " unfolding"    0.92
    " inside"       0.91
    " behind"       0.85
    " underneath"   0.82
    " unfold"       0.81
    " here"         0.79
    " elsewhere"    0.79
    "inside"        0.78
    " backstage"    0.78
    " upstairs"     0.78
    Act Density 0.033%

    No Known Activations