INDEX
    Explanations

    addressing masters and royalty

    Negative Logits
     interesante     0.47
     interesting     0.47
     intriguing      0.46
     arkadaşlar      0.44
     sympathique     0.44
     интерес         0.43
     guys            0.43
     좋아              0.43
     લોકોને          0.42
     интересных      0.42
    Positive Logits
     humbly          0.98
     humble          0.93
     humild          0.72
     Humble          0.71
    陛下               0.68
     servant         0.65
     unworthy        0.65
    Master           0.63
    🙇                0.63
     respectfully    0.62
    Activation Density 0.008%

    No Known Activations