    Explanations

    conspiratorial and politically charged language

    Negative Logits
        " Reincarn"     -0.74
        " Seym"         -0.74
        " limb"         -0.73
        " Stras"        -0.73
        " condem"       -0.72
        " tabloid"      -0.72
        " Tanz"         -0.70
        " Vaugh"        -0.69
        " Citiz"        -0.68
        " landmarks"    -0.68

    Positive Logits
        "ï¸ı"           1.25
        "lean"          1.01
        "company"       0.97
        "vernment"      0.95
        "Balt"          0.93
        "lime"          0.93
        "ever"          0.93
        "wow"           0.92
        "agree"         0.91
        "pol"           0.90

    Activation Density: 6.015%

    No Known Activations