    Explanations

    questions starting with what

    Questions, especially assistant-initiated interrogative phrases such as "What..." or "Anything...", that open a turn by requesting input or offering help.

    Negative Logits
    " শুধুমাত্র"    0.19
    " או"    0.18
    "などに"    0.17
    " केवल"    0.17
    " maupun"    0.17
    " yalnızca"    0.17
    " हालांकि"    0.17
    " लंबे"    0.17
    " jedynie"    0.17
    "หรือไม่"    0.17

    Positive Logits
    " do"    0.27
    "?"    0.26
    " motivates"    0.23
    " would"    0.22
    " exactly"    0.22
    " did"    0.22
        0.22
    "?!"    0.21
    "Exactly"    0.21
    " else"    0.20
    Activation Density: 0.609%

    No Known Activations