    Explanations

    patterns related to conspiracy theories and malicious attempts

    phrases related to conspiracy and plots

    Negative Logits
    Token          Logit
    "standing"     -0.78
    "answered"     -0.76
    "aired"        -0.73
    "felt"         -0.71
    "amo"          -0.70
    " Emerging"    -0.68
    "checked"      -0.68
    "enough"       -0.67
    "vo"           -0.67
    "apache"       -0.67
    Positive Logits
    Token             Logit
    " deceive"        1.60
    " intimidate"     1.47
    " undermine"      1.46
    " mislead"        1.43
    " assassinate"    1.42
    " deprive"        1.40
    " discredit"      1.39
    " manipulate"     1.39
    " confuse"        1.34
    " punish"         1.30
    Activation Density: 0.286%

    No Known Activations