INDEX
    Explanations

    deceptive representations and claims in political contexts

    Negative Logits
     mund           -0.52
     Introduced     -0.50
    otomy           -0.49
     Reserv         -0.48
    dess            -0.47
    Teks            -0.46
     BEAUTY         -0.46
    kao             -0.46
     introduce      -0.46
    BACKUP          -0.45
    Positive Logits
     utafitiHapana      0.65
     falsely            0.65
     المعيارى           0.65
     تانيه              0.65
    ConstraintMaker     0.63
    RegressionTest      0.61
    DockStyle           0.58
    findpost            0.58
    참고                0.56
     AssemblyCulture    0.56
    Activation Density 0.300%

    No Known Activations