    Explanations

    where to watch content

    This attention head attends to the start of the text from various points later in the text.

    Negative Logits
    " GPIO"           -0.09
    " Yang"           -0.08
    " XA"             -0.08
    "Ye"              -0.08
    "Yang"            -0.08
    " Interaction"    -0.08
    (blank)           -0.08
    " aggress"        -0.08
    "alala"           -0.08
    "Interaction"     -0.08
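    Positive Logits
    "免费在线观看"     0.11    (watch online for free)
    "免费观看"         0.11    (watch for free)
    "免费播放"         0.10    (free playback)
    " licensed"       0.10
    " authorized"     0.09
    "观看"             0.09    (watch)
    " rental"         0.09
    " находится"      0.08    (is located)
    " лиценз"         0.08    (word fragment: "licens")
    " authorised"     0.08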
    Act Density 0.036%

    No Known Activations