INDEX
    Explanations

    references to the word "gate"

    references to gates or barriers, indicating a focus on physical or metaphorical entry points

    Negative Logits
    Token          Logit
    " subp"        -0.73
    " constitu"    -0.66
    " own"         -0.65
    " skilled"     -0.65
    " specialized" -0.64
    " blueprint"   -0.62
    "inances"      -0.61
    ">>>>>>>>"     -0.58
    " marrow"      -0.57
    "ynt"          -0.57
    Positive Logits
    Token          Logit
    "gate"         1.49
    "Gate"         1.11
    "way"          0.93
    "boro"         0.87
    "ardless"      0.86
    "pole"         0.85
    "ways"         0.83
    "cloth"        0.83
    "math"         0.82
    "watch"        0.82
    Act Density 0.008%

    No Known Activations