INDEX
Explanations
phrases related to political criticism and negative evaluation
references to government decisions and their consequences
Negative Logits
Folder      -0.70
Offline     -0.70
Finder      -0.69
Blend       -0.68
FANTASY     -0.66
Cube        -0.66
Fold        -0.64
Explorer    -0.62
Rookie      -0.61
Scrolls     -0.61
Positive Logits
majorities    0.87
defund        0.87
Democr        0.84
perpetrated   0.81
discredited   0.79
neocons       0.78
threaten      0.75
milo          0.75
pretext       0.74
waged         0.73
Activations Density: 1.495%