INDEX
    Explanations

    phrases related to situations or outcomes conflicting with expectations

    repeated use of the word "despite."

    Negative Logits
    Token       Logit
    "aird"      -0.69
    "isen"      -0.67
    "lees"      -0.65
    "ecycle"    -0.65
    "ISE"       -0.63
    "isition"   -0.62
    "isa"       -0.61
    "aim"       -0.61
    "ahime"     -0.61
    "alky"      -0.61
    Positive Logits
    Token               Logit
    "math"              0.82
    " acknowledging"    0.79
    " having"           0.72
    " knowing"          0.68
    " lacking"          0.66
    " seeming"          0.66
    " conced"           0.65
    "ĸļ"                0.64
    "pite"              0.64
    " surviving"        0.64
    Activation Density: 0.016%

    No Known Activations