INDEX
    Explanations

    instances of the word "look" in various forms

    Negative Logits
    Token           Logit
    "avec"          -0.15
    "les"           -0.15
    "<?,"           -0.15
    "ément"         -0.15
    "ive"           -0.14
    "amax"          -0.14
    "gent"          -0.14
    "uction"        -0.14
    "934"           -0.14
    "ff"            -0.14

    Positive Logits
    Token           Logit
    " closely"      0.28
    " carefully"    0.25
    " deeper"       0.22
    " closer"       0.21
    " clos"         0.21
    " into"         0.20
    " Clo"          0.20
    " hard"         0.19
    "/list"         0.18
    " specifically" 0.18

    Activation Density: 0.042%

    No Known Activations