INDEX
    Explanations

    phrases related to dishonesty or misleading information

    Negative Logits
    Token                   Logit
    "ULTS"                  -0.75
    "kefeller"              -0.72
    "ILY"                   -0.69
    "NetMessage"            -0.69
    " Pigs"                 -0.63
    "Downloadha"            -0.61
    "externalActionCode"    -0.60
    "jack"                  -0.60
    "FE"                    -0.59
    "ilk"                   -0.59
    Positive Logits
    Token                   Logit
    "ation"                 2.22
    "ations"                2.07
    "ational"               1.83
    "ative"                 1.59
    "ATIONS"                1.47
    "ATION"                 1.38
    "atives"                1.36
    "ated"                  1.32
    "ary"                   1.30
    "ators"                 1.26
    Activation Density: 0.058%

    No Known Activations