INDEX
    Explanations

    instances of admission or acknowledgment of mistakes and failures

    Negative Logits
    Disposed    -0.07
    ابت    -0.07
    ebek    -0.07
    plet    -0.07
    onth    -0.07
    iev    -0.07
    โลก    -0.07
    basePath    -0.06
    ानन    -0.06
    leh    -0.06
    Positive Logits
     defeat    0.15
     defeated    0.10
     defeats    0.09
     mistakes    0.09
     reality    0.09
     mistake    0.08
     ownership    0.08
     error    0.08
     admission    0.08
     wrongdoing    0.08
    Act Density 0.016%

    No Known Activations