    Explanations

    phrases expressing degree or intensity, particularly the word "little."

    Negative Logits
    Token       Logit
    " Dise"     -0.19
    "loat"      -0.16
    "ivi"       -0.16
    "ymi"       -0.15
    "ohana"     -0.15
    "top"       -0.14
    "oyer"      -0.14
    "etur"      -0.14
    "ienne"     -0.14
    "Thumb"     -0.14
    Positive Logits
    Token       Logit
    "IPP"       0.16
    "IMA"       0.15
    "xcd"       0.15
    " Stark"    0.14
    "dea"       0.14
    "ipers"     0.14
    "йн"        0.14
    "clearfix"  0.14
    "ernals"    0.14
    "FDA"       0.13
    Activation Density: 0.022%

    No Known Activations