    Explanations

    instances of fraud or deceitful activities

    Negative Logits
     itſelf         -0.73
     Efq            -0.70
     fortific       -0.67
     intStringLen   -0.67
    ValueStyle      -0.66
    TagMode         -0.66
     myſelf         -0.64
     erec           -0.63
     ſind           -0.63
     <=",           -0.63
    Positive Logits
     scam           1.06
     scammed        0.96
     fooled         0.95
     scams          0.94
     defraud        0.88
     fraud          0.88
     deceived       0.87
     deceive        0.83
     fraude         0.83
     cheated        0.82
    Activation Density: 0.293%

    No Known Activations