    Explanations

    instances of the word "in."

    Negative Logits
    "mazon"          -0.16
    "ertools"        -0.16
    "asm"            -0.15
    " accordance"    -0.15
    "bac"            -0.15
    "rám"            -0.14
    "trag"           -0.14
    " spite"         -0.14
    "contri"         -0.14
    "tlement"        -0.14

    Positive Logits
    " turn"          0.29
    "verts"          0.28
    " itself"        0.28
    "ients"          0.27
    " fact"          0.26
    "izes"           0.25
    "iates"          0.25
    " question"      0.25
    "ched"           0.24
    "-turn"          0.24

    Act Density 0.170%

    No Known Activations