INDEX
    Explanations

    expressions of apology and SQL commands

    NEGATIVE LOGITS
    " bezeichneter"       -0.82
    " itſelf"             -0.68
    " ―――――"              -0.67
    " Efq"                -0.66
    " moschino"           -0.66
    " Monfieur"           -0.65
    " CreateTagHelper"    -0.65
    " ſever"              -0.65
    " gawas"              -0.63
    " ویکی‌پدیای"          -0.63
    POSITIVE LOGITS
    " please"             1.18
    "please"              1.08
    " Please"             0.98
    "Please"              0.94
    [token not rendered]  0.93
    " PLEASE"             0.92
    "PLEASE"              0.84
    [token not rendered]  0.82
    " pleases"            0.82
    " apologize"          0.81
    Activation Density: 0.095%

    No Known Activations