Explanations
political affiliations in statements
references to political affiliations, particularly conservatives and liberals
Negative Logits
Delivery                          -0.74
Bulldogs                          -0.70
Owner                             -0.64
BY                                -0.63
Kuala                             -0.63
Ship                              -0.62
inventoryQuantity                 -0.59
rawdownloadcloneembedreportprint  -0.58
                                  -0.57
Appearance                        -0.57
Positive Logits
ervatives   1.41
ervative    1.35
rejoice     0.97
paces       0.96
everywhere  0.92
aurus       0.85
alike       0.85
pace        0.85
despise     0.78
argue       0.78
Activations Density 0.070%