INDEX
    Explanations

    expressions of desires or wishes

    expressions of regret and desires for different circumstances

    Negative Logits
    Token               Logit
    " respectively"     -0.56
    "imposed"           -0.56
    " unacceptable"     -0.56
    " plank"            -0.55
    " intolerance"      -0.55
    " fallout"          -0.55
    " effectively"      -0.54
    "覚醒"              -0.54
    "stellar"           -0.53
    " violating"        -0.53
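Token lists like the one above are sometimes shown in the tokenizer's byte-level display form, where a non-ASCII token appears as a string such as `è¦ļéĨĴ`. The sketch below assumes the dashboard's model uses a GPT-2 style byte-level BPE vocabulary (the source does not say so); under that assumption it maps each displayed character back to its byte and decodes the result as UTF-8. The helper name `decode_display_token` is hypothetical.

```python
def bytes_to_unicode():
    """GPT-2 style table: printable bytes map to themselves,
    every other byte maps to chr(256 + n) in order of appearance."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("\u00a1"), ord("\u00ac") + 1))
        + list(range(ord("\u00ae"), ord("\u00ff") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


def decode_display_token(shown: str) -> str:
    """Hypothetical helper: invert the display mapping, then decode as UTF-8.
    Assumes `shown` contains only characters from the table above."""
    char_to_byte = {c: b for b, c in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in shown)
    return raw.decode("utf-8", errors="replace")


# The display form from the list above, written with escapes to avoid ambiguity.
garbled = "\u00e8\u00a6\u013c\u00e9\u0128\u0134"
decoded = decode_display_token(garbled)
print(decoded)
```

If the assumption holds, the same helper also turns the leading `Ġ` marker used by such vocabularies back into an ordinary space.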
    Positive Logits
    Token               Logit
    " hadn"             0.91
    " knew"             0.80
    " listened"         0.78
    "had"               0.77
    " remembered"       0.76
    " weren"            0.75
    " stayed"           0.73
    " didnt"            0.73
    "aned"              0.71
    "Had"               0.71
    Act Density 0.068%

    No Known Activations