INDEX
    Explanations

    words related to peace, peaceful situations, and peaceful actions

    references to peaceful interactions and protests

    Negative Logits
    "MAC"        -0.81
    "attr"       -0.79
    "odor"       -0.77
    "GPU"        -0.74
    "olls"       -0.73
    "paralle"    -0.73
    "asper"      -0.72
    "ANA"        -0.72
    "alach"      -0.71
    "ripp"       -0.71
    Positive Logits
    " peaceful"      1.03
    "edIn"           0.89
    "peace"          0.84
    "ness"           0.81
    " minded"        0.79
    " peace"         0.76
    " peacefully"    0.74
    " Yemeni"        0.74
    " resolution"    0.74
    " nonviolent"    0.73
    Act Density 0.011%

    No Known Activations