INDEX
    Explanations

    contractions indicating negative sentiment or disbelief

    phrases expressing uncertainty or conditionality

    Negative Logits
    ishing          -0.69
    arer            -0.64
    «ĺ              -0.62
    ÃŁ              -0.62
    acca            -0.61
    assing          -0.60
    Shell           -0.59
    active          -0.59
     Cosponsors     -0.58
     Tik            -0.58
    Positive Logits
    ĸļ              0.78
    enance          0.73
    uce             0.70
     tumble         0.69
    clinton         0.68
    tarians         0.67
     offend         0.65
    arez            0.63
     sooner         0.63
    yip             0.62
    Activation Density: 0.219%

    No Known Activations