INDEX
Explanations
phrases related to addressing social issues and injustices
Negative Logits
Inventory       -0.72
Folder          -0.65
Remy            -0.64
Explosion       -0.62
Manufacturer    -0.62
pandemonium     -0.61
Farn            -0.60
Delivery        -0.60
Pair            -0.59
Peoples         -0.58
Positive Logits
genuinely       1.02
offended        0.91
legitimately    0.91
willing         0.89
otherwise       0.89
harmed          0.89
victimized      0.86
actively        0.86
interested      0.85
impacted        0.85
Activations Density 0.188%