INDEX
    Explanations

    references to "attention."

    Negative Logits
    Token           Logit
    "jaw"           -0.15
    "achten"        -0.15
    "(MPI"          -0.15
    "GRAY"          -0.15
    "oplan"         -0.14
    "stk"           -0.14
    "arest"         -0.14
    "cs"            -0.14
    "elan"          -0.14
    "ittel"         -0.14
    Positive Logits
    Token           Logit
    " attention"    0.21
    " Attention"    0.19
    "attention"     0.16
    "nist"          0.15
    "ight"          0.15
    "ship"          0.14
    " Swift"        0.14
    ".globalData"   0.14
    "ships"         0.14
    "FAST"          0.14
    Activation Density: 0.029%

    No Known Activations