INDEX
    Explanations

    hypocrisy in arguments or statements, particularly when there is a disconnect between actions and stated beliefs

    Negative Logits
    Token          Logit
    " Jefus"       -0.77
    "ientôt"       -0.75
    " Efq"         -0.71
    "ffions"       -0.70
    " houſe"       -0.70
    "ftant"        -0.69
    " pleaſure"    -0.69
    " myſelf"      -0.69
    "Gizmos"       -0.68
    "iffion"       -0.68
    Positive Logits
    Token              Logit
    "<bos>"            0.60
    "文中"             0.54
    "참고"             0.49
    " concluding"      0.49
    " discusses"       0.46
    " conclude"        0.46
    "id"               0.45
    " detailed"        0.45
    "subsubsection"    0.44
    " authors"         0.44
    Act Density 0.916%

    No Known Activations