INDEX
    Explanations

    descriptive terms or actions related to disruptive or damaging transformation processes

    terms related to degradation or decline

    Negative Logits
    OWS         -0.83
    Reviewer    -0.82
    razil       -0.80
    glers       -0.75
    STER        -0.71
     Holmes     -0.71
    aneers      -0.69
    ONY         -0.66
    amia        -0.66
    intendent   -0.65
    Positive Logits
    ync         0.98
     resil      0.92
    ktop        0.91
    irable      0.88
    perate      0.87
     embr       0.86
    erve        0.85
    semb        0.83
    erving      0.80
    iple        0.79
    Act Density 0.005%

    No Known Activations