INDEX
    Explanations
    Negative Logits
    Token         Logit
    "Vest"        -0.09
    "Buck"        -0.08
    "deliver"     -0.08
    "buck"        -0.08
    "lack"        -0.08
    "bcc"         -0.08
    "jeros"       -0.08
    " Evangel"    -0.08
    "lady"        -0.08
    [missing]     -0.08
    Positive Logits
    Token           Logit
    "(prompt"       0.09
    "^{-"           0.08
    " imdb"         0.07
    [missing]       0.07
    [missing]       0.07
    " Tex"          0.07
    " pb"           0.07
    "(("            0.07
    "(My"           0.07
    "(separator"    0.07
    Activation Density: 0.001%

    No Known Activations