INDEX
    Explanations

    references to small or diminutive entities

    Negative Logits
    omaly     -0.17
    nts       -0.15
    ipay      -0.15
    ulf       -0.15
    oug       -0.15
    /OR       -0.15
    das       -0.15
    acific    -0.15
    nil       -0.15
    uld       -0.14
    Positive Logits
    -known     0.17
    iferay     0.17
    tons       0.17
    /small     0.16
    john       0.16
    agues      0.16
    hood       0.16
    igation    0.15
    atur       0.15
     bit       0.15
    Act Density 0.034%

    No Known Activations