INDEX
Explanations
references to a specific political figure named Palin
Negative Logits
Token        Logit
Dickens      -0.70
ires         -0.69
URI          -0.68
omething     -0.65
reek         -0.64
Increases    -0.62
ortun        -0.60
onal         -0.58
CLSID        -0.58
upon         -0.58
Positive Logits
Token        Logit
Palin        1.14
tera         0.80
istani       0.76
igon         0.75
aska         0.74
iband        0.73
bernatorial  0.72
EStream      0.70
imitation    0.68
sty          0.68
Activations Density 0.011%
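
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the dashboard does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the synthetic activation list are hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    The dashboard itself does not spell out its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data (not from the dashboard): 11 active tokens out of 100,000.
acts = [0.0] * 99_989 + [1.5] * 11
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.011%
```

Under that assumption, a density of 0.011% corresponds to the feature firing on roughly 11 of every 100,000 tokens.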