    Explanations

    negative words associated with deception and misinformation

    references to misinformation and its effects

    Negative Logits
     GOODMAN              -0.86
    ufact                 -0.79
    natureconservancy     -0.78
    hesion                -0.74
    Temperature           -0.72
    ederation             -0.72
    entary                -0.71
    bridge                -0.70
    gio                   -0.69
    pection               -0.68
    Positive Logits
     perpetrated          1.03
     slander              0.96
     insin                0.92
     baseless             0.91
     disinformation       0.91
     accusations          0.91
     misinformation       0.90
     falsehood            0.88
     bigotry              0.88
     bigot                0.88
    Activation Density: 0.778%

    No Known Activations