INDEX
Explanations
asking questions
Lines that pose questions, or question-style headings, especially those starting with an interrogative word such as who, what, how, or where.
Negative Logits
ر    0.99
ق    0.83
خت    0.77
ت    0.75
ه    0.75
ొక్క    0.74
'    0.69
رى    0.68
رخ    0.68
نت    0.68
Positive Logits
ı    1.23
?    1.13
ă    1.11
in    0.97
?"    0.93
ພວກເຮົາ    0.93
?.    0.91
ว    0.88
acclaim    0.86
uzun    0.84
Activations Density 0.335%
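
A minimal sketch of how a density figure like the one above could be computed, assuming it means the fraction of token positions on which the feature fires. The page does not state the exact definition, so the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means share of tokens with a nonzero activation.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 1000
sample[10] = 2.5
sample[500] = 0.7
sample[900] = 1.1

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.335% would mean the feature is active on roughly 3 to 4 tokens out of every 1000 sampled.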