Explanations
mentions of Donald Trump
Negative Logits
ipo       -0.16
anlı      -0.15
geber     -0.15
廳        -0.15
gers      -0.15
erna      -0.15
žen       -0.14
empl      -0.14
ipers     -0.14
ogy       -0.14
Positive Logits
enstein   0.19
ian       0.17
ster      0.17
nell      0.17
εξ        0.17
Donald    0.16
eter      0.16
onium     0.15
ulent     0.15
oval      0.14
Activations Density: 0.017%