INDEX
Explanations
proper nouns, especially related to politics and news
phrases related to political contexts and figures
Negative Logits
Sabbath      -1.02
Weiss        -1.02
Weiss        -1.00
Wisconsin    -0.92
Somerset     -0.90
Hess         -0.89
Denver       -0.88
Hers         -0.85
Frost        -0.85
Sheffield    -0.85
Positive Logits
Duterte      2.11
Filipino     2.03
Philippine   1.98
uterte       1.98
Manila       1.86
Philippines  1.84
Filip        1.71
Marcos       1.58
Rodrigo      1.39
Manny        1.27
Activations Density 0.288%