INDEX
    Explanations

    expressions of regret and acknowledgment of past mistakes

    Negative Logits
    Token             Logit
     كمان             -0.60
     ویکی‌آمباردا      -0.52
     fungus           -0.49
     Bioaccumulative  -0.48
    Patience          -0.48
    ophones           -0.48
     invari           -0.47
     allo             -0.47
    )\}$              -0.47
     CONDITION        -0.46
    Positive Logits
    Token             Logit
     regretted        0.72
     regrets          0.68
     edit             0.66
    VersionUID        0.64
     Typo             0.64
     typo             0.64
    ynb               0.64
     فريبيس           0.63
    后悔              0.63
     edits            0.61
    Activation Density: 0.130%

    No Known Activations