INDEX
Explanations
references to political events and government actions
Negative Logits
Token        Logit
Quantity     -0.80
..."         -0.71
trop         -0.69
crappy       -0.68
decency      -0.68
______       -0.66
shitty       -0.65
Fuck         -0.65
nig          -0.65
crap         -0.64
Positive Logits
Token          Logit
meanwhile      1.06
reportedly     1.05
prompted       1.03
also           1.02
however        0.92
bolstered      0.87
additionally   0.85
spokeswoman    0.85
furthermore    0.84
spokesman      0.84
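The logit lists pair each token with a score. As a rough sketch of how such a ranking might be produced, the snippet below assumes (the source does not say this) that each score is the dot product of a feature's output direction with that token's unembedding vector. The function names, the two-dimensional vectors, and the toy vocabulary are all hypothetical and chosen only for illustration.

```python
def logit_effects(direction, unembed):
    """Score each token by the dot product of the (assumed) feature
    direction with that token's (assumed) unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(direction, vec))
        for token, vec in unembed.items()
    }


def top_tokens(effects, k, positive=True):
    """Return the k tokens with the largest scores (positive=True)
    or the smallest scores (positive=False), as (token, score) pairs."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=positive)
    return ranked[:k]


# Toy data: hypothetical 2-dimensional vectors, not values from the page.
direction = [1.0, -0.5]
unembed = {
    "meanwhile": [1.0, 0.0],
    "however": [0.5, -0.5],
    "crap": [-0.5, 0.5],
    "the": [0.0, 0.0],
}

effects = logit_effects(direction, unembed)
most_positive = top_tokens(effects, 2, positive=True)
most_negative = top_tokens(effects, 1, positive=False)
print(most_positive)
print(most_negative)
```

Sorting one score dictionary in both directions keeps the positive and negative lists consistent with each other, since both come from the same set of values.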
Activations Density 9.481%
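One plausible reading of the density figure, stated here as an assumption rather than something the page confirms, is the share of sampled tokens on which the feature has a non-zero activation. A minimal sketch under that assumption, with a hypothetical function name and made-up sample values:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above the threshold.

    Assumes "density" means the share of sampled tokens on which the
    feature fires; returns 0.0 for an empty sample.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Made-up activations for eight tokens; two of them are non-zero.
sample = [0.0, 0.0, 1.3, 0.0, 0.2, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

The `.3%` format specifier renders the fraction in the same three-decimal percentage style as the figure on the page.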