    Explanations

    expressions of apology or regret

    Negative Logits
    Token                  Logit
    "imonial"              -0.16
    " migrationBuilder"    -0.15
    "atables"              -0.15
    "uye"                  -0.15
    "arga"                 -0.15
    "_PP"                  -0.15
    "urette"               -0.15
    "egie"                 -0.14
    "entionPolicy"         -0.14
    "eatures"              -0.14
    Positive Logits
    Token                  Logit
    " apologies"           0.27
    " apologize"           0.27
    " apologized"          0.27
    " apology"             0.26
    " apolog"              0.26
    " Ap"                  0.26
    "ap"                   0.22
    "Ap"                   0.22
    " regrets"             0.21
    " sorry"               0.21
    Activation Density: 0.049%

    No Known Activations
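
To make the density figure above concrete, here is a minimal sketch. It assumes that activation density means the percentage of sampled tokens on which the feature's activation is nonzero; the page itself does not define the term. The function name `activation_density` and the sample data are hypothetical and chosen only to reproduce a value of 0.049%.

```python
def activation_density(activations):
    """Return the percentage of tokens whose activation is greater than zero.

    Assumption: "density" is the share of tokens on which the feature fires;
    the source page does not state the exact definition.
    """
    if not activations:
        raise ValueError("activations must contain at least one token")
    active = sum(1 for a in activations if a > 0)
    return 100.0 * active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 49 activate the feature.
sample = [1.5] * 49 + [0.0] * (100_000 - 49)
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.049% corresponds to the feature firing on roughly 49 of every 100,000 tokens, which fits a feature tied to a narrow set of apology-related tokens.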