INDEX
    Explanations

    instances where a preferred action or choice is made over an alternative

    Negative Logits
    " landmark"       -0.69
    " Previously"     -0.64
    " milestone"      -0.61
    " particularly"   -0.60
    "estamp"          -0.60
    "anta"            -0.60
    " anniversary"    -0.59
    " exceeds"        -0.58
    "mentioned"       -0.57
    " Earlier"        -0.57
    Positive Logits
    " merely"         1.19
    " concentrate"    1.08
    " simply"         1.00
    "Instead"         0.90
    " purely"         0.89
    " relying"        0.86
    " foc"            0.86
    " instead"        0.85
    " bland"          0.81
    " focus"          0.81
    Activation Density: 3.978%

    No Known Activations