INDEX
    Explanations

    comparative and superlative adjectives related to goodness or benefit

    terms indicating positivity or improvement

    Negative Logits
    "pter"      -0.81
    "onut"      -0.71
    "dot"       -0.69
    "*/("       -0.68
    "dit"       -0.68
    "yss"       -0.67
    "ancies"    -0.66
    "abies"     -0.65
    "mares"     -0.65
    "ADRA"      -0.65
    Positive Logits
    "enough"          0.96
    " enough"         0.93
    " suited"         0.91
    " Enough"         0.82
    "aligned"         0.74
    " correlated"     0.73
    " ambassadors"    0.70
    " safest"         0.70
    "wired"           0.70
    " aligned"        0.67
    Activation Density: 0.127%

    No Known Activations