INDEX
    Explanations

    instances of apologies or demands for apologies

    instances of apologies or expressions of regret

    Negative Logits
    jun       -0.87
    weeney    -0.85
    spot      -0.80
    marked    -0.74
    corn      -0.74
    tail      -0.71
    arnaev    -0.68
    tails     -0.68
    picking   -0.66
    cop       -0.65
    Positive Logits
     apologize    1.28
     apologized   1.19
     apologise    1.15
     apologised   1.13
     apologizing  1.09
     apology      1.08
     apologies    1.06
     sorry        0.90
     apolog       0.87
     pardon       0.83
    Activation Density 0.010%

    No Known Activations