INDEX
    Explanations

    mentions of failure or lack of success

    instances of the word "failure" in various contexts

    Negative Logits
    Token         Logit
    "utra"        -0.81
    "selves"      -0.76
    "enfranch"    -0.75
    "rete"        -0.69
    "othy"        -0.67
    "riel"        -0.66
    " Sed"        -0.66
    "enta"        -0.63
    "ople"        -0.62
    "atu"         -0.62
    Positive Logits
    Token         Logit
    " miser"      1.20
    " failures"   0.82
    "DEV"         0.80
    "Failure"     0.78
    " dism"       0.77
    "lust"        0.74
    "ulence"      0.73
    "luster"      0.73
    " failure"    0.71
    "fail"        0.70
    Activation Density: 0.030%

    No Known Activations