INDEX
Explanations
words related to negative actions or attitudes towards others
words related to "demand" or "call for action."
Negative Logits
Token    Logit
WAYS     -0.80
supper   -0.69
Elves    -0.68
URN      -0.66
ORY      -0.66
Aid      -0.64
WAY      -0.64
OWS      -0.64
ORED     -0.64
Hole     -0.63
Positive Logits
Token          Logit
agogue         1.11
onym           1.07
ographically   1.05
agog           1.04
igration       1.02
ploy           1.01
ilit           1.00
otions         0.99
utation        0.97
ixed           0.95
Activations Density: 0.008%