INDEX
    Explanations

    instances of the word "call" and related phrases

    Negative Logits
    Token         Logit
    "365"         -0.17
    "atk"         -0.16
    "els"         -0.15
    "imo"         -0.15
    "ara"         -0.15
    "ubo"         -0.15
    "ihat"        -0.14
    "ycl"         -0.14
    "ye"          -0.14
    "yt"          -0.14
    Positive Logits
    Token         Logit
    " dib"        0.32
    " attention"  0.29
    " quits"      0.26
    " upon"       0.25
    "oused"       0.25
    "igraphy"     0.24
    "ously"       0.23
    " Attention"  0.23
    " forth"      0.23
    "attention"   0.22
    Activation Density: 0.052%

    No Known Activations