    Explanations

    titles or phrases containing actions or commands

    references to songs and musical works

    Negative Logits
    Token           Logit
    "artifacts"     -0.76
    "oidal"         -0.74
    " quo"          -0.72
    " sidew"        -0.71
    "imposed"       -0.69
    "warts"         -0.69
    "intent"        -0.68
    " chem"         -0.68
    " strengths"    -0.66
    "ickr"          -0.66
    Positive Logits
    Token           Logit
    " Us"           1.12
    " Them"         1.05
    " Hate"         1.02
    " Guys"         1.00
    " Wrong"        0.97
    " Love"         0.97
    " Own"          0.95
    " Believe"      0.94
    " Dating"       0.94
    " Happ"         0.93
    Activation Density: 0.205%

    No Known Activations