INDEX
    Explanations

    asking clarifying questions

    Negative Logits
    roman           0.46
    景象            0.42
    mis             0.40
     fascination    0.40
                    0.40
     தள             0.39
     discussions    0.39
     प्रयासों         0.38
    dab             0.38
    torn            0.38
    Positive Logits
     politely       0.64
     aloud          0.59
     уточ           0.56
     rhet           0.51
     öffentlich     0.50
     clarifying     0.49
     abiert         0.47
     specifics      0.47
     verbally       0.46
     otáz           0.46
    Act Density 0.012%

    No Known Activations