    Explanations

    words related to failures or negative outcomes

    terms related to failures or negative outcomes

    Negative Logits
    Token           Logit
    "ingham"        -0.75
    "othermal"      -0.74
    "venants"       -0.74
    "trak"          -0.71
    "atures"        -0.71
    "ignty"         -0.68
    "otine"         -0.68
    " weights"      -0.66
    "types"         -0.66
    "akens"         -0.66
    Positive Logits
    Token           Logit
    "erella"        0.82
    " mishand"      0.78
    "itous"         0.78
    " disastrous"   0.77
    "ilton"         0.74
    " bung"         0.73
    " botched"      0.72
    " miser"        0.72
    " Spac"         0.71
    " Ukrain"       0.69
    Activation Density: 0.026%

    No Known Activations