INDEX
    Explanations

    transitional words indicating a contrast or contradiction

    occurrences of the word "Yet."

    Negative Logits
    "tained"    -0.80
    "tein"      -0.71
    "omial"     -0.70
    "edu"       -0.69
    "packs"     -0.69
    "atti"      -0.67
    "rities"    -0.66
    "ancial"    -0.65
    "sword"     -0.64
    "cases"     -0.62
    Positive Logits
    "tons"            0.97
    " somehow"        0.85
    " alas"           0.82
    " strangely"      0.80
    "heric"           0.78
    "entimes"         0.74
    "theless"         0.74
    " despite"        0.71
    " nonetheless"    0.70
    "oner"            0.70
    Act Density 0.015%

    No Known Activations