Explanations
defining roles using "you are a"
Negative Logits
ንም          0.46
告诉你       0.44
भैया         0.42
anskje      0.42
<unused40>  0.41
太太         0.41
<unused45>  0.41
ľudí        0.41
Occupation  0.40
धवन         0.40
Positive Logits
chat            0.61
chat            0.58
chatbot         0.54
assisting       0.54
renowned        0.53
AI              0.52
conversational  0.51
convers         0.50
chatting        0.50
appointed       0.49
Activations Density: 0.019%