INDEX
    Explanations

    apologies or instances of someone publicly expressing regret for their actions

    instances of public apologies

    Negative Logits
    Downloadha    -0.84
    corn          -0.71
    arnaev        -0.70
    aida          -0.69
    uana          -0.69
    weeney        -0.68
    production    -0.66
    tails         -0.66
    adj           -0.66
    eu            -0.65
    Positive Logits
     unres         1.28
     apologized    1.07
     apologize     1.03
     sincerely     1.02
     prof          0.99
     apology       0.97
    giving         0.91
     apologise     0.90
     apologizing   0.90
     apologised    0.90
    Activation Density: 0.048%

    No Known Activations