INDEX
Explanations
references to tools or instruments, particularly knives
references to weapons and tools
Negative Logits
Token       Logit
Hilton      -0.97
billboards  -0.93
Autism      -0.89
Ads         -0.84
Airport     -0.84
Liberia     -0.83
dopamine    -0.83
NBC         -0.81
gov         -0.81
Colony      -0.80
Positive Logits
Token       Logit
swords      2.15
sword       2.15
blade       2.08
sword       2.05
blades      1.89
Sword       1.84
dagger      1.75
knives      1.69
Swords      1.68
Sword       1.67
Activations Density 0.282%