    Explanations

    expressions of humility or references to being humble

    Negative Logits
    Token       Logit
    "adem"      -0.17
    "oci"       -0.17
    "heel"      -0.15
    "ampion"    -0.15
    "allen"     -0.15
    "p"         -0.15
    "upo"       -0.14
    "pcf"       -0.14
    "ogn"       -0.14
    "pour"      -0.14
    Positive Logits
    Token       Logit
    " hum"      0.43
    " Hum"      0.40
    "Hum"       0.38
    "pty"       0.30
    "hum"       0.29
    "iliated"   0.25
    "mers"      0.24
    "iliate"    0.24
    "ankind"    0.23
    "mock"      0.23
    Activation Density: 0.011%

    No Known Activations