INDEX
Explanations
phrases related to various types of acts conducted by individuals or groups
various forms of "act" related to concepts of aggression, violence, and morality
Negative Logits
Token         Logit
kees          -0.79
sshd          -0.75
ials          -0.75
devices       -0.74
Lines         -0.74
Flavoring     -0.73
strands       -0.69
facets        -0.68
levels        -0.67
Transcript    -0.67
Positive Logits
Token         Logit
kindness      1.20
vandalism     1.14
defiance      1.09
sabotage      1.09
heroism       1.06
aggression    1.04
desperation   1.02
generosity    1.00
bravery       0.97
solidarity    0.94
Activations Density: 0.060%