INDEX
    Explanations

    references to lying or deceitful behavior

    Negative Logits
    "Décès"       -0.69
    " рады"       -0.66
    " Allentown"  -0.63
    "évaluateur"  -0.63
    " Câmara"     -0.62
    "hadiran"     -0.61
    " McIn"       -0.60
    " antemano"   -0.59
    "{\"          -0.59
    " Maynard"    -0.59

    Positive Logits
    " lie"        2.43
    " lies"       2.21
    " lying"      2.06
    " LIE"        2.00
    " Lie"        1.93
    " Lies"       1.82
    " Lying"      1.80
    "Lies"        1.79
    "Lying"       1.74
    "Lie"         1.70
    Act Density 0.070%

    No Known Activations