    Explanations

    please or informational prompts

    Negative Logits
    Token          Value
    " แล้ว"        0.56
    " Então"       0.55
    " 然後"        0.55
    " 然后"        0.54
    " rồi"         0.53
    " Rồi"         0.52
    "Rồi"          0.52
    " Тогда"       0.52
    " 그러면"      0.50
    " Then"        0.50
    Positive Logits
    Token            Value
    " please"        1.09
    "please"         0.98
    "Please"         0.90
    " Please"        0.89
    [not shown]      0.82
    " PLEASE"        0.75
    [not shown]      0.74
    [not shown]      0.71
    " pls"           0.68
    " пожалуйста"    0.67
    Act Density 0.005%
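
A minimal sketch of how a figure like the activation density above could be read, assuming it means the fraction of sampled tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: the feature fires on 5 of 100,000 tokens.
sample = [0.0] * 99_995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.005%
```

Under that reading, a density of 0.005% corresponds to roughly 5 activating tokens per 100,000.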

    No Known Activations