INDEX
    Explanations

    phrases indicating setting objectives or goals

    phrases indicating the intention or purpose of actions

    Negative Logits
    eries           -0.65
    ilege           -0.62
    antha           -0.61
    ery             -0.61
     outage         -0.60
    atching         -0.60
    avorable        -0.60
    essee           -0.59
    interstitial    -0.59
    leness          -0.59
    Positive Logits
     anew           0.87
    fitted          0.85
    posts           0.73
    gow             0.68
    llor            0.65
    tracks          0.64
    AAF             0.64
    Goal            0.64
    ¯¯¯¯¯¯¯¯        0.64
    ngth            0.63
    Activation Density: 0.047%

    No Known Activations