Explanations
words related to political figures, plans for the future, and official statements
Negative Logits
quickShipAvailable   -0.76
worthy               -0.68
Explan               -0.68
ritical              -0.68
clips                -0.61
pite                 -0.60
checking             -0.59
sensibilities        -0.58
Canary               -0.58
comprehension        -0.58
Positive Logits
pursue       1.14
revise       1.13
introduce    1.09
utilize      1.09
propose      1.08
continue     1.07
publish      1.07
invest       1.07
donate       1.05
retire       1.05
Activations Density 0.096%