INDEX
    Explanations

    phrases related to clarification or direct statements

    expressions related to accountability and transparency in communication

    Negative Logits
     Created          -0.73
     Seen             -0.62
     Purch            -0.61
     Transactions     -0.61
    conservancy       -0.61
    soDeliveryDate    -0.59
     Located          -0.59
     Lumpur           -0.58
    IDs               -0.57
    effects           -0.56
    Positive Logits
     sarcast          1.07
     stating          1.05
     explaining       1.05
     mentioning       1.00
     rhet             1.00
     praising         0.98
     apologizing      0.98
     criticizing      0.97
     reiter           0.96
    :"                0.96
    Activation Density 0.348%

    No Known Activations