INDEX
    Explanations

    statements or phrases clarifying or emphasizing a point within a text

    phrases that express certainty or conclusions

    Negative Logits
    Token          Logit
    " conflic"     -0.82
    "¥ŀ"           -0.76
    "Tai"          -0.70
    "xtap"         -0.69
    " hemor"       -0.67
    " notor"       -0.67
    "itially"      -0.66
    " cumbers"     -0.65
    "estern"       -0.65
    "phrine"       -0.64

    Positive Logits
    Token          Logit
    " goodbye"      1.46
    " hello"        1.06
    " Goodbye"      0.90
    "ieu"           0.88
    " aloud"        0.87
    " sorry"        0.83
    " farewell"     0.81
    " bye"          0.78
    " hi"           0.73
    "ings"          0.71

    Activation Density: 0.048%

    No Known Activations