INDEX
    Explanations

    references to deception or authenticity in documents and communications

    Negative Logits
    Token             Logit
    "á»ijc"           -0.15
    "Insensitive"     -0.14
    "orgot"           -0.14
    "linger"          -0.14
    " Antar"          -0.13
    "declspec"        -0.13
    "Unchecked"       -0.13
    "нÑĸв"            -0.13
    "quip"            -0.13
    "_drvdata"        -0.13

    Positive Logits
    Token             Logit
    " fake"           0.69
    "fake"            0.60
    " Fake"           0.59
    "Fake"            0.55
    " faker"          0.52
    " false"          0.47
    "_fake"           0.47
    "åģĩ"             0.45
    "(fake"           0.45
    ".fake"           0.43
    Act Density 0.482%

    No Known Activations
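
    The Act Density figure above is read here as the fraction of token positions on which the feature fires. The source does not define the metric, so that reading is an assumption, as are the zero threshold, the function name `activation_density`, and the sample values, which are made up for illustration and are not taken from this feature. A minimal sketch of the calculation:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions);
    the dashboard itself does not state the definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for 1000 token positions, 5 of them nonzero.
sample = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

    Under this reading, a density of 0.482% would correspond to roughly 5 active positions per 1000 tokens.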