INDEX
    Explanations

    phrases indicating negative judgments or situations

    mentions of the word "poor."

    Negative Logits
    "ategory"                    -0.99
    "thus"                       -0.71
    "CU"                         -0.70
    "SPONSORED"                  -0.68
    "theless"                    -0.67
    "BuyableInstoreAndOnline"    -0.67
    "Pi"                         -0.67
    "auer"                       -0.66
    "Laughs"                     -0.66
    "natureconservancy"          -0.65
    Positive Logits
    "die"             0.92
    "dies"            0.89
    " quality"        0.88
    " souls"          0.88
    "quality"         0.84
    " luck"           0.84
    " sap"            0.81
    " grades"         0.81
    " performers"     0.79
    " imitation"      0.78
    Act Density 0.036%

    No Known Activations