Explanations
my name, role, or objective
Negative Logits
عدم            0.49
omitting       0.42
unwillingness  0.42
lack           0.42
你的           0.42
inability      0.41
तुम्हारा        0.41
lack           0.41
your           0.39
intuitively    0.38
Positive Logits
motto          0.67
specialties    0.57
mantra         0.56
Motto          0.55
nickname       0.54
Background     0.49
nicknames      0.48
consists       0.47
specializes    0.47
èles           0.46
Activations Density 0.006%