    Explanations

    contractions and informal speech patterns like "I'll" or "can't."

    conversational phrases that indicate personal opinions or sentiments

    Negative Logits
    】
    -0.79
    覚醒
    -0.72
    ™:
    -0.70
    ●
    -0.70
    ¥ŀ
    -0.66
     urgently
    -0.65
    currently
    -0.64
     ®
    -0.64
     Passage
    -0.63
    上
    -0.62
    Positive Logits
     hindsight
    1.06
    laughs
    1.01
    Laughs
    0.91
     somebody
    0.84
     [
    0.84
    .")
    0.83
     regrets
    0.80
     embarrassed
    0.79
    .'"
    0.79
     ['
    0.78
    Activation Density: 0.557%

    No Known Activations