INDEX
Explanations
questions starting with how or what
explicit user prompts or directives—imperatives and questions that instruct the assistant to perform a task or answer a query.
Negative Logits
azoned      0.21
!),         0.20
รวม         0.20
비롯        0.20
ഏറെ         0.20
国内外      0.19
étale       0.19
tochy       0.19
itulah      0.19
മറ്റു        0.19
Positive Logits
𝑥           0.27
I           0.21
ኸ           0.21
میکنم       0.20
violently   0.20
したい      0.20
pharmacy    0.19
F           0.19
potassium   0.19
resistor    0.19
Activations Density 7.617%