Explanations
references to political appointments and affiliations
Negative Logits
thinkable   -0.15
isay        -0.15
Čer         -0.15
AFX         -0.14
Vys         -0.14
Ryder       -0.14
Bauer       -0.14
jte         -0.14
eyle        -0.14
utz         -0.14
Positive Logits
under       0.37
Obama       0.35
Barack      0.30
Bush        0.30
Obama       0.29
under       0.28
Trump       0.27
Clinton     0.27
_under      0.25
dưới        0.24
Activations Density 0.271%