    Explanations

    phrases related to various types of acts conducted by individuals or groups

    various forms of "act" related to concepts of aggression, violence, and morality

    Negative Logits
    "kees"          -0.79
    " sshd"         -0.75
    "ials"          -0.75
    "devices"       -0.74
    " Lines"        -0.74
    " Flavoring"    -0.73
    " strands"      -0.69
    " facets"       -0.68
    "levels"        -0.67
    " Transcript"   -0.67
    Positive Logits
    " kindness"      1.20
    " vandalism"     1.14
    " defiance"      1.09
    " sabotage"      1.09
    " heroism"       1.06
    " aggression"    1.04
    " desperation"   1.02
    " generosity"    1.00
    " bravery"       0.97
    " solidarity"    0.94
    Activation Density 0.060%

    No Known Activations