INDEX
    Explanations

    phrases related to causality or influence

    instances of the word "to," indicating a focus on expressions of purpose or intention

    Negative Logits
    Token           Logit
    " contrace"     -0.70
    " tucked"       -0.70
    " handled"      -0.68
    "Pixel"         -0.67
    " geared"       -0.67
    " toured"       -0.67
    " todd"         -0.66
    " cared"        -0.65
    " headlined"    -0.65
    " touched"      -0.65
    Positive Logits
    Token              Logit
    "icial"            0.82
    "ティ"             0.73
    " extinction"      0.70
    "ゴン"             0.66
    "minist"           0.66
    "ディ"             0.66
    "ym"               0.66
    " breakthrough"    0.66
    "obin"             0.65
    "ournal"           0.65
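
    Token listings like the one above sometimes show non-ASCII vocabulary entries in a byte-level display form (for example `ãĥĨãĤ£` where the underlying text is ティ). The following is a minimal sketch of how such strings can be mapped back to readable text, assuming the model uses a GPT-2-style byte-level BPE vocabulary; the source does not name the tokenizer, so that is an assumption. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of any dashboard API.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> display-character table in the style of GPT-2 byte-level BPE.

    Printable bytes map to themselves; the remaining bytes are assigned
    code points starting at 256, in increasing byte order.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: display character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Convert a byte-level display string back to UTF-8 text.

    Assumes every character in `display` comes from the table above;
    incomplete UTF-8 sequences are shown as U+FFFD rather than raising.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in display)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥĨãĤ£", "ãĤ´ãĥ³", "ãĥĩãĤ£"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

    Under the same assumption, a leading `Ġ` in a display string corresponds to a leading space in the decoded token, which is why some entries in the lists begin with a space.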
    Activation Density: 0.088%

    No Known Activations