    Explanations

    mentions of attention or intention in the text

    terms related to attention and intention

    Negative Logits
    " Mehran"        -0.78
    "senal"          -0.75
    "accompan"       -0.69
    " crest"         -0.68
    " tribal"        -0.66
    " Juliet"        -0.63
    " Recon"         -0.62
    " shatter"       -0.61
    " Mend"          -0.60
    " succeeding"    -0.60

    Positive Logits
    "ention"          1.05
    "edly"            0.90
    "aldehyde"        0.88
    "rary"            0.88
    "theless"         0.86
    "ertodd"          0.85
    "estinal"         0.85
    "endment"         0.83
    "rontal"          0.83
    "ally"            0.83
    Activation Density: 0.005%

    No Known Activations