Explanations
news headline structures, mentioning locations and actions
news articles or reports with strong indicators of location or headlines
Negative Logits
venge       -0.71
fing        -0.70
blinding    -0.70
aea         -0.70
rew         -0.69
habit       -0.68
unt         -0.68
fool        -0.67
ocular      -0.67
vanity      -0.67
Positive Logits
GOODMAN     0.99
VIDE        0.91
Latest      0.90
CLUS        0.89
IMAGES      0.88
Protesters  0.84
HAEL        0.82
LAT         0.81
RELEASE     0.80
Provides    0.79
Activations Density: 0.388%