    Explanations

    references to the 9/11 attacks and related conspiracy theories

    Negative Logits
    itably      -0.75
    iating      -0.71
    iated       -0.71
    holders     -0.70
    ivari       -0.70
    iator       -0.67
    itable      -0.67
    enture      -0.66
    pmwiki      -0.66
    iations     -0.65

    Positive Logits
    9999        1.36
    999         1.27
    06          1.22
    090         1.20
    07          1.14
    08          1.12
    03          1.10
    04          1.08
    02          1.04
    09          1.02
    Activation Density: 0.065%

    No Known Activations