INDEX
    Explanations

    terms related to deception or misinformation

    references to misleading information or statements

    Negative Logits
    mun         -0.80
    area        -0.71
    ────────    -0.71
    aldo        -0.69
    itar        -0.69
    mega        -0.68
    FM          -0.66
    Merit       -0.66
    dain        -0.66
    ucha        -0.66
    Positive Logits
    ingly            1.02
     misleading      0.92
     misled          0.87
     mislead         0.85
     deceive         0.81
     misrepresent    0.78
     statements      0.75
     tactics         0.74
     excuse          0.73
     disclosures     0.72
    Act Density 0.015%
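
    As a rough illustration of the density figure above, the sketch below assumes that activation density means the fraction of token positions where the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; the page itself does not state how the number is computed.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: a position counts as active when its activation
    is strictly greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active positions out of 20,000 tokens.
sample = [0.0] * 19_997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")  # 0.015%
```

    Under this assumed definition, a density of 0.015% corresponds to about 15 active positions per 100,000 tokens.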

    No Known Activations