    Explanations

    adjectives and verbs related to challenges, difficulties, and limitations

    references to obstacles or challenges in a societal or technological context

    Negative Logits
    Token          Logit
    "abad"         -0.67
    "enhagen"      -0.65
    "icipated"     -0.63
    "deen"         -0.62
    "ichita"       -0.59
    "facebook"     -0.57
    "oway"         -0.56
    "elcome"       -0.56
    " Yad"         -0.55
    "ilitary"      -0.54
    Positive Logits
    Token            Logit
    " ours"          0.79
    " inefficient"   0.75
    " detriment"     0.73
    " innovate"      0.72
    " harm"          0.72
    " ourselves"     0.71
    " destructive"   0.71
    " outweigh"      0.69
    " destruct"      0.68
    " trivial"       0.68
    Activation Density: 0.985%

    No Known Activations