INDEX
Explanations
specific organizations, entities, and proper nouns mentioned in interviews or articles
names of organizations, companies, or countries
Negative Logits
given     -0.65
*.        -0.59
.''.      -0.57
ppard     -0.54
destro    -0.54
Yug       -0.54
*/(       -0.53
aceae     -0.53
.'"       -0.51
obyl      -0.51
Positive Logits
Media      0.65
Leaks      0.64
Observer   0.61
Newsp      0.60
Oversight  0.60
Herald     0.59
Blog       0.59
news       0.57
Channel    0.56
Privacy    0.56
Activations Density: 1.136%