    Explanations

    references to dishonesty and falsehoods in political discourse

    Negative Logits
    ÑĪки
    -0.17
    iba
    -0.15
    ÑĤоÑĤ
    -0.14
    еÑģи
    -0.14
     åº
    -0.14
     Stub
    -0.14
    ÑĢоÑī
    -0.14
    755
    -0.14
    _hooks
    -0.13
    uctor
    -0.13
    Positive Logits
     Perkins
    0.14
    CJK
    0.14
    ae
    0.14
     Q
    0.14
    dex
    0.14
    _gettime
    0.13
     Oaks
    0.13
    atto
    0.13
     porr
    0.13
     inconvenience
    0.13
    Activation Density: 0.148%

    No Known Activations