    Explanations

    numeric values and dates relevant to factual events or statistics

    Negative Logits
    Token        Logit
    "ending"     -0.16
    "embros"     -0.15
    "edor"       -0.15
    "enger"      -0.14
    "vais"       -0.14
    "inding"     -0.14
    "olis"       -0.14
    "essel"      -0.14
    "inho"       -0.14
    "ubit"       -0.14
    Positive Logits
    Token              Logit
    "ADVERTISEMENT"    0.16
    "orget"            0.14
    "witter"           0.14
    " tweet"           0.14
    "tweet"            0.14
    " twe"             0.14
    "Tweet"            0.14
    "WindowText"       0.14
    "ADS"              0.14
    "<\/"              0.13
    Activation Density: 0.007%

    No Known Activations