INDEX
    Explanations

    phrases related to contrasting good and bad situations

    phrases that compare positive and negative aspects

    Negative Logits
    Token          Logit
    "eters"        -0.82
    " veins"       -0.73
    "agne"         -0.71
    "sbm"          -0.70
    "rontal"       -0.68
    "inez"         -0.68
    "artment"      -0.67
    "iard"         -0.66
    "quit"         -0.66
    "ttes"         -0.66
    Positive Logits
    Token          Logit
    " brightest"   0.94
    " Powerful"    0.81
    " bad"         0.79
    " prosperous"  0.79
    " evil"        0.78
    " honorable"   0.77
    "Evil"         0.75
    " indifferent" 0.75
    " mighty"      0.74
    " equitable"   0.74
    Activation Density: 0.197%

    No Known Activations