INDEX
    Explanations

    phrases related to taking actions or steps

    repeated mentions of the word "the" indicating a focus on articles

    Negative Logits
    -+-+         -0.82
    lished       -0.81
    tions        -0.79
    alde         -0.77
    tion         -0.77
    lich         -0.76
    Operation    -0.76
    ambo         -0.74
    cade         -0.73
    ntil         -0.73
    Positive Logits
     brunt         1.35
     plunge        1.34
     opportunity   1.16
     reins         1.15
     initiative    1.12
     helm          1.10
     bait          1.07
     blame         1.07
     liberty       1.04
     cue           0.95
    Act Density 0.049%

    No Known Activations