    Explanations

    expressions of humility and related qualities

    Negative Logits
    Token       Logit
    "thumbs"    -0.16
    "villa"     -0.14
    "aldo"      -0.14
    "ainless"   -0.14
    "ONO"       -0.14
    "onte"      -0.13
    " wire"     -0.13
    "zilla"     -0.13
    "term"      -0.13
    " helm"     -0.13
    Positive Logits
    Token       Logit
    "kker"      0.16
    "ardy"      0.16
    "idata"     0.16
    "ERRU"      0.16
    "kins"      0.15
    " hum"      0.15
    "anka"      0.15
    "Hum"       0.15
    " Hum"      0.14
    "isle"      0.14
    Activation Density: 0.014%

    No Known Activations