INDEX
    Negative Logits
    Token          Logit
    "307"          -0.06
    " doorstep"    -0.06
    " baz"         -0.06
    " xv"          -0.06
    "plement"      -0.06
    "/car"         -0.06
    "vendor"       -0.06
    "better"       -0.06
    "coins"        -0.06
    ".ds"          -0.06
    Positive Logits
    Token          Logit
    "M"            0.13
    "m"            0.11
    " M"           0.11
    " m"           0.11
    ".M"           0.11
    "[M"           0.08
    ",M"           0.08
    "-m"           0.08
    ">M"           0.08
    " RM"          0.08
    Activation Density: 0.172%

    No Known Activations