    Explanations

    words that indicate emphasis or specificity, particularly in the context of descriptors

    Negative Logits
    anas      -0.16
    zano      -0.14
    isé       -0.14
     Dum      -0.14
    pio       -0.14
    orne      -0.14
    neas      -0.14
    /epl      -0.14
     order    -0.14
    enta      -0.13
    Positive Logits
    peg        0.18
    fuse       0.15
    766        0.15
    828        0.15
    694        0.14
     Atkins    0.14
    PEG        0.14
    those      0.14
    _Helper    0.14
     those     0.14
    Activation Density: 0.028%

    No Known Activations