INDEX
    Explanations

    phrases related to moral judgements, particularly the concept of evil

    instances of the word "evil" and its associated contexts

    Negative Logits
    ribe            -0.78
    Lago            -0.78
    PsyNetMessage   -0.77
    ribes           -0.76
    illation        -0.75
    UNCH            -0.73
    lov             -0.71
    aro             -0.71
    RESULTS         -0.71
    drops           -0.70
    Positive Logits
     incarn         1.04
     evil           0.96
     mastermind     0.87
     enemy          0.87
     villain        0.86
     genius         0.86
     twin           0.85
     undermin       0.85
     adversary      0.84
     deed           0.83
    Activation Density: 0.013%

    No Known Activations