INDEX
    Explanations

    references to conspiracy theories and related skepticism

    Negative Logits
    hev     -0.17
    çŃĨ     -0.15
    ehr     -0.15
    ijd     -0.14
    Ậ       -0.14
     Beit   -0.14
    446     -0.14
    Zen     -0.14
     Cool   -0.14
    461     -0.14

    Positive Logits
     inan   0.18
    exion   0.15
    adolu   0.14
    alam    0.13
    COPE    0.13
    utin    0.13
    ureau   0.13
    chest   0.13
    rawer   0.13
    ýt      0.13
    Activation Density: 0.100%

    No Known Activations