INDEX
    Explanations

    phrases related to having power and control

    questions that seek opinions or observations about a topic

    Negative Logits
    Token         Logit
    " godd"       -0.72
    " Godd"       -0.70
    "ocide"       -0.68
    "Fuck"        -0.68
    "death"       -0.64
    "*."          -0.64
    " Guant"      -0.63
    "bush"        -0.63
    " goddamn"    -0.62
    "Enlarge"     -0.62
    Positive Logits
    Token             Logit
    " however"        1.10
    " therefore"      0.91
    " moreover"       0.91
    " furthermore"    0.86
    " emphas"         0.85
    " meanwhile"      0.84
    " also"           0.79
    " especially"     0.76
    " particularly"   0.74
    " util"           0.74
    Activation Density: 1.694%

    No Known Activations