INDEX
    Explanations

    phrases indicating range or variation in quantities

    Negative Logits
    resy        -0.78
    mone        -0.73
    lees        -0.72
    POST        -0.72
    orsi        -0.70
    icipated    -0.70
    rition      -0.69
    iquette     -0.67
    nor         -0.65
    itement     -0.64
    Positive Logits
     mildly      0.96
     mild        0.75
     afar        0.73
     innocuous   0.73
     quirky      0.71
     thinly      0.70
     humorous    0.70
     benign      0.70
     mundane     0.69
     outright    0.67
    Activation Density: 0.028%

    No Known Activations