INDEX
    Explanations

    sentences that mention the concept of work or task completion

    Negative Logits
    "ister"          -0.69
    "clair"          -0.68
    "akening"        -0.67
    "arium"          -0.67
    "omew"           -0.66
    "epad"           -0.65
    "ridge"          -0.64
    "ename"          -0.63
    " nonetheless"   -0.62
    "wen"            -0.62

    Positive Logits
    " bells"          0.95
    " fuss"           0.91
    " facets"         0.88
    " goodies"        0.85
    " hoop"           0.83
    "things"          0.80
    " ingredients"    0.79
    " stuff"          0.79
    " usual"          0.78
    " components"     0.77
    Activation Density: 0.638%

    No Known Activations