Explanations
references to political discussions and investigations
Negative Logits
201         -0.29
usterity    -0.25
Obama       -0.25
Obama       -0.24
tweeted     -0.23
tweet       -0.22
۲۰۱         -0.22
tweeting    -0.20
            -0.20
Trump       -0.20
Positive Logits
Bush        0.40
Bush        0.37
bush        0.31
Iraq        0.28
Iraqi       0.25
Iraq        0.23
Cheney      0.23
Saddam      0.21
Kyoto       0.21
Abu         0.21
Activations Density 0.036%