INDEX
    Explanations

    phrases that begin with the word "don’t" or variations of it

    Negative Logits
    Token             Logit
    " pus"            -0.66
    " DRAGON"         -0.64
    "EStreamFrame"    -0.63
    "EStream"         -0.62
    "ħĭ"              -0.61
    " Species"        -0.60
    " spoiled"        -0.60
    "phal"            -0.59
    " Featured"       -0.59
    "milo"            -0.59

    Positive Logits
    Token             Logit
    "'t"              1.54
    "ations"          0.92
    "ned"             0.91
    "ately"           0.91
    "atives"          0.89
    "ÃŃ"              0.87
    "nel"             0.87
    "ning"            0.87
    "ovan"            0.87
    "itely"           0.86
    Activation Density: 0.023%

    No Known Activations