INDEX
Explanations
news articles discussing specific events or reports
Negative Logits
sacrific        -0.81
BILITIES        -0.76
atics           -0.73
discriminate    -0.71
iva             -0.67
disagree        -0.65
anka            -0.63
pert            -0.62
perceive        -0.62
discharged      -0.61
Positive Logits
Scrib           0.99
NPR             0.95
Wikileaks       0.91
Pastebin        0.90
Billboard       0.87
Archives        0.86
Collider        0.86
February        0.85
Newsweek        0.85
Politico        0.84
Activations Density: 1.480%