INDEX
    Explanations

    instances where an action is being performed or a comparison is made

    phrases containing variations of the verb "do" and "did."

    Negative Logits
    Token          Logit
    "cipled"       -0.69
    "urst"         -0.63
    "Published"    -0.60
    "asta"         -0.60
    "iken"         -0.59
    " SAR"         -0.58
    "ussed"        -0.58
    " Provided"    -0.57
    "ouple"        -0.57
    "bent"         -0.57
    Positive Logits
    Token          Logit
    "pez"          0.86
    "ettings"      0.72
    "zed"          0.72
    "aughters"     0.68
    "zing"         0.67
    " mosqu"       0.66
    "jet"          0.57
    "llor"         0.57
    "Downloadha"   0.57
    " whenever"    0.56
    Activation Density: 0.056%

    No Known Activations