    Explanations

    abbreviations/acronyms followed by a numerical value

    the end of sections or text content markers

    Negative Logits
    Token       Logit
    orate       -0.85
    iard        -0.74
    izons       -0.71
    urate       -0.70
    raising     -0.63
    illard      -0.62
    acting      -0.61
    andel       -0.61
    umen        -0.61
    jamin       -0.61

    Positive Logits
    Token       Logit
    eways       0.86
    zhen        0.78
    atchewan    0.76
    plings      0.74
    ustain      0.70
    cery        0.70
    ority       0.69
    atoon       0.69
    utra        0.69
    hett        0.67

    Activation Density: 0.243%

    No Known Activations