INDEX
    Explanations

    instances of contrasting information or unexpected outcomes

    the conjunction "but" to signal contrasting ideas or exceptions

    Negative Logits
    resa        -0.61
    velt        -0.60
    olution     -0.58
    uphem       -0.58
    enced       -0.58
    uto         -0.58
    oin         -0.57
    archment    -0.57
    mop         -0.56
    nce         -0.56
    Positive Logits
    tons            1.19
    chery           1.05
     nevertheless   0.87
    tered           0.86
    chers           0.86
     alas           0.83
     luckily        0.82
     nonetheless    0.82
    ler             0.80
     fortunately    0.79
    Activation Density: 0.100%

    No Known Activations