INDEX
    Explanations

    phrases that refer to drawing attention or indicating focus

    Negative Logits
    TY
    -0.17
    ookie
    -0.17
    uckle
    -0.15
    pios
    -0.14
     caracter
    -0.14
    rant
    -0.14
     Rosenstein
    -0.14
    iasi
    -0.14
    shaw
    -0.13
    etag
    -0.13
    Positive Logits
     attention
    0.70
    attention
    0.57
     Attention
    0.56
    Attention
    0.52
     внимание
    0.45
    _attention
    0.41
     atención
    0.38
    注意
    0.37
     вним
    0.36
     notice
    0.35
    Act Density 0.065%

    No Known Activations