INDEX
    Explanations

    emotionally charged and evaluative words or phrases

    descriptive phrases expressing strong opinions or emotions

    Negative Logits
    rongh       -0.82
    execute     -0.78
    idden       -0.78
    obook       -0.76
    orsi        -0.74
    govtrack    -0.73
    artney      -0.73
    Downloadha  -0.73
    bara        -0.72
    foreseen    -0.71
    Positive Logits
     huh          1.21
     eh           0.95
     tho          0.88
     coincidence  0.84
     kidding      0.83
     congr        0.81
     downside     0.80
     ya           0.72
    !!            0.72
     Kills        0.71
    Act Density 0.287%

    No Known Activations