INDEX
    Explanations

    phrases indicating alternatives or contrasts

    instances of a contrasting phrase or structure that begins with "Instead."

    Negative Logits
    Token         Logit
    "Condition"   -0.66
    "vez"         -0.62
    "SF"          -0.62
    "ASED"        -0.61
    "ented"       -0.59
    "ental"       -0.58
    "CLUD"        -0.58
    "ENTS"        -0.57
    "gin"         -0.57
    "AG"          -0.55
    Positive Logits
    Token         Logit
    "ples"        0.74
    " thereof"    0.72
    " opting"     0.71
    " of"         0.69
    "ortun"       0.68
    "ilon"        0.68
    "achu"        0.66
    "terness"     0.66
    ",."          0.63
    " we"         0.63
    Activation Density: 0.024%

    No Known Activations