    Negative Logits
    "guh"         0.45
    "gv"          0.43
    "zj"          0.42
    "gj"          0.41
    "equ"         0.39
    "basename"    0.39
    "inian"       0.38
    "))$"         0.38
    "Regular"     0.38
                  0.37
    Positive Logits
    " stack"      1.00
    " Stack"      0.98
    " stacked"    0.93
    "Stack"       0.87
    " stacks"     0.86
                  0.86
    "stack"       0.84
    " stacking"   0.84
    "stacks"      0.76
    "stacked"     0.74
    Activation Density: 0.008%

    No Known Activations