INDEX
Explanations
phrases related to political news and legislation
references to political power and control
Negative Logits
Shoot          -0.56
Explan         -0.53
resy           -0.52
ruck           -0.52
Stir           -0.51
Ending         -0.51
Shift          -0.50
rets           -0.49
Changes        -0.49
:=             -0.49
Positive Logits
interstitial    0.73
).              0.62
Downloadha      0.59
.).             0.59
)."             0.58
).[             0.58
?).             0.57
})              0.56
").             0.55
]."             0.54
Activations Density 2.466%