    Explanations

    phrases related to paying attention or the concept of attentiveness

    Negative Logits
    Token          Logit
    " Efq"         -1.08
    " Monfieur"    -1.00
    " Theſe"       -1.00
    " itſelf"      -1.00
    " myſelf"      -1.00
    " raiſ"        -0.99
    " Jefus"       -0.95
    "ſelf"         -0.94
    " againſt"     -0.90
    " himſelf"     -0.90
    Positive Logits
    Token          Logit
    " attention"    0.97
    "ced"           0.89
    "han"           0.79
    "attention"     0.75
    " han"          0.67
    " Attention"    0.67
    "HAN"           0.66
    " atten"        0.61
    " att"          0.59
    " attentive"    0.59
    Activation Density: 0.088%

    No Known Activations