INDEX
Explanations
mentions of political figures, particularly President Donald Trump
Negative Logits
nexus       -1.09
arteries    -0.94
ocr         -0.93
oned        -0.92
00007       -0.89
fman        -0.86
onal        -0.85
Reviewed    -0.84
bred        -0.83
applic      -0.82
Positive Logits
swer        1.21
Trump       1.06
Taj         1.04
Donald      1.02
Fallon      0.99
ª           0.99
\\\\\\\\    0.99
¼           0.95
Enrique     0.94
mares       0.94
Activations Density 0.465%