INDEX
    Explanations

    instances of the phrase "I do" in various contexts

    Negative Logits
    b         -0.18
    اÛĮÙĩ      -0.18
    doing     -0.17
    ness      -0.17
    noon      -0.17
    nt        -0.17
    ature     -0.16
    pher      -0.16
    gr        -0.16
    sterol    -0.15
    Positive Logits
    zed       0.22
    ÂŃing     0.21
    zen       0.21
    ings      0.21
    (es       0.21
    xor       0.20
    yles      0.20
    able      0.20
    ctr       0.20
    led       0.19
    Act Density 0.045%

    No Known Activations