INDEX
    Explanations

    instances of the word "look" and its variations used to direct attention

    Negative Logits
    "uner"      -0.19
    "eka"       -0.18
    "MENT"      -0.17
    "eken"      -0.17
    "ekt"       -0.15
    "soever"    -0.15
    "idor"      -0.15
    "uctor"     -0.14
    "/by"       -0.14
    "ffen"      -0.14
    Positive Logits
    " closely"     0.21
    "sharp"        0.21
    " familiar"    0.21
    " ma"          0.20
    " Sharp"       0.20
    " sharp"       0.19
    "outs"         0.19
    " Fam"         0.19
    " Ma"          0.18
    "fant"         0.18
    Activation Density: 0.017%

    No Known Activations