Explanations
references to Donald Trump and his administration
Negative Logits
nten       -0.18
lia        -0.15
gend       -0.15
Spiral     -0.14
Diesel     -0.14
ště        -0.14
uib        -0.14
dbg        -0.14
UTOR       -0.14
cession    -0.14
Positive Logits
Poll       0.15
aces       0.15
352        0.15
.rad       0.14
use        0.14
LIFE       0.14
rip        0.14
ius        0.14
Joi        0.14
оно        0.14
Activations Density: 0.029%