INDEX
    Explanations

    instances where someone is being accused of deception

    instances of the word "lying" to indicate dishonesty or falsehood

    Negative Logits
    Token        Logit
    "Ultra"      -0.77
    "ugal"       -0.72
    "FN"         -0.72
    "ORE"        -0.72
    "entry"      -0.71
    "era"        -0.71
    "Effective"  -0.71
    "aud"        -0.71
    "ISO"        -0.70
    "aldi"       -0.70
    Positive Logits
    Token        Logit
    " lying"     0.94
    " horizont"  0.89
    " liar"      0.80
    " lie"       0.79
    " skelet"    0.78
    " seiz"      0.77
    " lied"      0.77
    " pills"     0.75
    " mortg"     0.74
    " camoufl"   0.73
    Act Density 0.006%
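
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed: the fraction of tokens on which a feature's activation is above zero. This is an assumption about the metric, not a statement of how this page's value was produced. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 tokens, of which 6 have a nonzero activation.
sample = [0.0] * 99_994 + [1.3, 0.4, 2.1, 0.9, 0.2, 3.0]
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

With this made-up sample, 6 active tokens out of 100,000 corresponds to a density of 0.006%, which is the order of magnitude reported above.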

    No Known Activations