INDEX
    Explanations

    asking clarifying questions

    Negative Logits
    Token          Logit
    "pthread"      0.84
    "Spo"          0.79
    "cję"          0.77
    "Arts"         0.76
    "sera"         0.76
    "Stre"         0.76
    "ఫ్‌"           0.74
    "Substring"    0.73
    "npm"          0.72
    "Econom"       0.71
    Positive Logits
    Token            Logit
    " questions"     1.55
    " question"      1.19
    " Questions"     1.15
    " probing"       1.14
    " permission"    1.10
    " about"         1.08
    " how"           1.06
    " rhet"          1.03
    " clarifying"    1.01
    " Fragen"        1.01
    Activation Density: 0.022%

    No Known Activations