INDEX
    Explanations

    phrases related to misinformation and manipulation for political or personal gain

    instances of deception or misleading information

    Negative Logits
    Token             Logit
    "atonin"          -0.81
    "ipeg"            -0.80
    "ridor"           -0.79
    "ixel"            -0.78
    "cade"            -0.78
    "ftime"           -0.76
    "utra"            -0.76
    "pring"           -0.75
    " Aires"          -0.75
    "enhagen"         -0.73

    Positive Logits
    Token             Logit
    " misplaced"      1.09
    " incompetent"    1.08
    " deceit"         1.07
    " misunderstood"  1.07
    " inadequ"        1.05
    " misrepresent"   1.03
    " inept"          0.99
    " illeg"          0.98
    " unworthy"       0.98
    " misleading"     0.98

    Activation Density: 0.355%

    No Known Activations