    Explanations

    instances of the word "walk" and its various forms

    Negative Logits
    Token      Logit
    "itt"      -0.15
    "Äı"       -0.15
    "lc"       -0.15
    "uels"     -0.15
    "ijke"     -0.14
    "illet"    -0.14
    "_DL"      -0.14
    "inges"    -0.14
    "edi"      -0.14
    "luv"      -0.14
    Positive Logits
    Token      Logit
    " walk"    0.32
    "walk"     0.29
    " Walk"    0.29
    "Walk"     0.27
    " walks"   0.26
    " walked"  0.26
    ".walk"    0.24
    "_walk"    0.23
    " away"    0.23
    "æŃ©"      0.22
    Activation Density: 0.024%

    No Known Activations