    Explanations

    instances of the word "do" and its variations in different contexts

    Negative Logits
    doing     -0.21
    never     -0.20
    do        -0.18
    nt        -0.17
    b         -0.17
    ni        -0.17
    gr        -0.17
    rary      -0.17
    par       -0.17
    more      -0.17
    Positive Logits
    zed       0.24
    led       0.20
    oming     0.20
    able      0.20
    ctype     0.20
    ctr       0.19
     recall   0.19
    zen       0.19
    (es       0.19
    xor       0.18
    Act Density 0.047%

    No Known Activations