INDEX
    Explanations

    comparisons where one thing is deemed to be superior or preferable to another

    comparative phrases expressing something being better or worse than something else

    Negative Logits
    encers        -0.86
    arta          -0.84
    ゼウス        -0.78
    rones         -0.74
    united        -0.74
    bard          -0.72
    iencies       -0.71
    assadors      -0.71
    anta          -0.71
    pter          -0.70
    Positive Logits
     having        1.01
     pretending    1.00
     relying       0.99
     anything      0.99
     putting       0.98
     messing       0.97
     slapping      0.96
     guessing      0.96
     letting       0.96
     deleting      0.93
    Activation Density 0.095%

    No Known Activations