INDEX
    Explanations

    references to social etiquette and courtesy

    Negative Logits
    Token        Logit
    "olla"       -0.14
    "127"        -0.14
    "owi"        -0.13
    "wend"       -0.13
    "/is"        -0.13
    "icc"        -0.13
    " stren"     -0.12
    "arel"       -0.12
    "ност"       -0.12
    " incons"    -0.12
    Positive Logits
    Token          Logit
    " courtesy"    0.52
    " Courtesy"    0.46
    "Courtesy"     0.42
    " etiquette"   0.42
    " manners"     0.42
    " civ"         0.41
    " courteous"   0.41
    " polite"      0.38
    "礼"           0.37
    " polit"       0.37
    Act Density 0.531%

    No Known Activations