Explanations
mentions of specific historical events, particularly related to terrorist attacks
references to the September 11 attacks
Negative Logits
Russ          -0.74
Beau          -0.73
netflix       -0.73
Gomez         -0.71
Cumber        -0.69
Grimes        -0.69
Razer         -0.66
Vaugh         -0.66
uca           -0.66
wine          -0.66
Positive Logits
hij           1.04
attacks       1.01
Attacks       0.96
terrorist     0.95
terror        0.92
attacks       0.87
anniversary   0.87
terrorists    0.86
perpetrators  0.86
commem        0.85
Activations Density 0.049%