INDEX
    Explanations

    words related to promoting something, or to actions that promote it

    Negative Logits
    */(               -0.82
    psons             -0.80
     サーティワン           -0.76
    ANG               -0.75
    assian            -0.75
    displayText       -0.74
     partName         -0.71
     Detected         -0.71
    fuck              -0.70
    カ                 -0.68
    Positive Logits
     abstinence        0.96
     atheism           0.91
     separat           0.90
     boycot            0.83
     virtues           0.81
     tolerance         0.80
     wellness          0.80
     unity             0.80
     patriotism        0.80
     authenticity      0.79
    Activation Density: 0.082%

    No Known Activations