Explanations
mentions of prominent political figures, particularly Donald Trump and George W. Bush
Negative Logits
Token       Logit
jad         -0.16
.Promise    -0.16
uer         -0.15
ron         -0.15
xis         -0.14
jos         -0.14
istrib      -0.14
atis        -0.14
jav         -0.14
onth        -0.14
Positive Logits
Token       Logit
Tome        0.16
mpar        0.15
Buckley     0.14
ç¿Ķ         0.14
psilon      0.14
Exercise    0.14
eki         0.13
áÄį         0.13
cles        0.13
antry       0.13
Activations Density: 0.116%