    Explanations

    defining roles using "you are a"

    NEGATIVE LOGITS
    Token           Value
    "ንም"            0.46
    "告诉你"         0.44
    " भैया"          0.42
    "anskje"        0.42
    "<unused40>"    0.41
    "太太"           0.41
    "<unused45>"    0.41
    " ľudí"         0.41
    "Occupation"    0.40
    " धवन"          0.40
    POSITIVE LOGITS
    Token               Value
    "chat"              0.61
    " chat"             0.58
    " chatbot"          0.54
    " assisting"        0.54
    " renowned"         0.53
    " AI"               0.52
    " conversational"   0.51
    " convers"          0.50
    " chatting"         0.50
    " appointed"        0.49
    Activation Density: 0.019%

    No Known Activations