Explanations
references to Trump and his administration's policies
Negative Logits
arges        -0.17
arton        -0.16
parlament    -0.15
nues         -0.15
burg         -0.15
ạ            -0.14
fts          -0.14
onse         -0.14
hop          -0.14
ãĤıãģĽ       -0.14
Positive Logits
executive    0.31
Executive    0.29
Executive    0.26
Trump        0.26
rollback     0.23
roll         0.23
resc         0.23
exec         0.22
Trump        0.21
EO           0.21
Activations Density 0.148%