INDEX
    Explanations

    phrases that indicate deception or manipulation in communication

    Negative Logits
     myſelf          -0.93
     itſelf          -0.84
     Efq             -0.84
     Jefus           -0.84
     houſe           -0.79
     Monfieur        -0.71
     Houſe           -0.71
     preſent         -0.70
    ſelf             -0.69
     himſelf         -0.68

    Positive Logits
     croire          0.77
     rằng            0.70
     estimés         0.63
    UrlResolution    0.59
     AspNetCore      0.59
     that            0.57
     bahwa           0.56
    qtype            0.53
     claims          0.52
     مب              0.52

    Activation Density 0.195%

    No Known Activations