INDEX
    Explanations

    phrases related to conspiracy theories and extremist groups

    Negative Logits
    hooting       -0.81
    Ĥİ            -0.69
    bis           -0.67
    HAEL          -0.67
    Sport         -0.65
    jen           -0.63
    Benz          -0.62
     Digest       -0.62
    steps         -0.61
     Slowly       -0.61
    Positive Logits
     knack         1.25
     tendency      1.06
     chance        1.02
     penchant      1.00
     vested        0.93
     capability    0.93
     pedigree      0.92
     flair         0.91
     opportunity   0.90
     affinity      0.89
    Activation Density: 4.709%

    No Known Activations