INDEX
    Explanations

    expressions of preference using the phrase "rather" followed by an action or situation

    expressions of preference and comparison

    Negative Logits
    Token      Logit
    "orig"     -0.68
    "mentioned" -0.65
    "INAL"     -0.65
    "idden"    -0.64
    "ults"     -0.62
    "uss"      -0.62
    "bing"     -0.62
    " Yard"    -0.61
    "Sah"      -0.60
    "uum"      -0.60
    Positive Logits
    Token          Logit
    " prioritize"  0.81
    " emulate"     0.78
    " than"        0.78
    " settle"      0.77
    " lose"        0.76
    " spend"       0.76
    " avoid"       0.75
    " tolerate"    0.75
    " stay"        0.74
    " survive"     0.73
    Activation Density: 0.019%

    No Known Activations