INDEX
    Explanations

    adjectives describing negative physical conditions or outcomes

    negative descriptions of conditions or states

    Negative Logits
    Token          Logit
    "ership"       -0.77
    "inarily"      -0.75
    "alist"        -0.73
    "uality"       -0.73
    "cript"        -0.73
    "agy"          -0.73
    "htaking"      -0.71
    "itionally"    -0.71
    "iferation"    -0.71
    "iture"        -0.70
    Positive Logits
    Token          Logit
    " behaved"     0.98
    " beaten"      0.85
    " damaged"     0.79
    " enough"      0.79
    " mistaken"    0.78
    "asses"        0.78
    " suited"      0.77
    " poisoned"    0.76
    " bitten"      0.76
    "needed"       0.75
    Activation Density: 0.021%

    No Known Activations