INDEX
    Explanations

    expressions related to manners, specifically focusing on concepts of rudeness and politeness

    terms related to rudeness and politeness

    Negative Logits
    "lisher"           -0.86
    "ernels"           -0.83
    "ishop"            -0.75
    "panel"            -0.75
    "hart"             -0.74
    "razil"            -0.73
    "ARK"              -0.71
    "lished"           -0.68
    "ilation"          -0.68
    "yrinth"           -0.68

    Positive Logits
    " rude"             0.93
    " rud"              0.87
    " etiquette"        0.87
    " awakening"        0.81
    " polite"           0.81
    " manners"          0.79
    " disrespectful"    0.78
    " disrespect"       0.78
    " greeting"         0.76
    " respectfully"     0.73
    Activation Density: 0.041%

    No Known Activations