INDEX
    Explanations

    words related to politeness and respect

    terms related to courtesy and respectful behavior

    Negative Logits
    ortium          -0.86
     Sett           -0.62
     Price          -0.62
     Citation       -0.61
     Bravo          -0.59
    resso           -0.58
     Fargo          -0.57
     Rx             -0.55
     Shutterstock   -0.55
     Heard          -0.55
    Positive Logits
    ous             1.23
    ously           1.20
    OUS             0.98
    hing            0.87
    anship          0.84
    astic           0.84
    iously          0.81
    aunts           0.80
    astically       0.79
    ctory           0.79
    Activation Density: 0.137%

    No Known Activations