    Explanations

    quantitative descriptors indicating an increase or comparison

    Negative Logits
    "azzo"        -0.17
    "olle"        -0.15
    "ñana"        -0.15
    "ureau"       -0.15
    "omaly"       -0.15
    " Hlav"       -0.14
    "ours"        -0.14
    "urette"      -0.14
    "ovna"        -0.14
    "riel"        -0.14

    Positive Logits
    " than"       0.21
    " dern"       0.18
    "-than"       0.18
    " cazzo"      0.16
    "than"        0.15
    "undry"       0.15
    "arness"      0.15
    " handful"    0.14
    "方面"        0.14
    "/e"          0.14

    Act Density 0.062%

    No Known Activations