    Explanations

    comparative relationships between two concepts, where one concept is usually favored over, or more advantageous than, the other

    comparative phrases that quantify improvement or decline

    Negative Logits
    Token           Logit
    "aws"           -0.73
    "sr"            -0.73
    "ittees"        -0.71
    "tags"          -0.71
    "nar"           -0.68
    "brew"          -0.67
    "ptives"        -0.66
    "hack"          -0.65
    "kus"           -0.65
    "stan"          -0.63

    Positive Logits
    Token           Logit
    "uliffe"        0.72
    " likely"       0.72
    " Farage"       0.68
    " actu"         0.67
    " healthier"    0.66
    " likelihood"   0.66
    " incentive"    0.66
    " payoff"       0.65
    " flux"         0.63
    "cyclopedia"    0.63
    Activation Density: 0.053%

    No Known Activations