INDEX
Explanations
phrases related to political figures and events
mentions of specific organizations, news agencies, or entities related to media
Negative Logits
osponsors   -0.72
taboola     -0.70
differe     -0.63
abor        -0.62
ordinate    -0.60
abet        -0.59
cumbers     -0.59
ité         -0.59
ospons      -0.58
anat        -0.58
Positive Logits
11          1.02
41          1.01
19          1.00
31          0.99
17          0.98
23          0.98
39          0.98
29          0.97
21          0.97
27          0.96
Activations Density 0.044%
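
The density figure above is usually read as the share of token positions on which the feature fires at all. The following is a minimal sketch of that calculation, assuming density means the fraction of positions with activation above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature is active (activation strictly greater than the threshold).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, of which 4 are active.
sample = [0.0] * 9996 + [0.3, 1.2, 0.7, 2.5]
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.044% would correspond to roughly 44 active positions per 100,000 tokens.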