INDEX
    Explanations

    attends to "trend" tokens across various forms of the word "trend" in different contexts

    Head Attr Weights
    0:0.59
    1:0.01
    2:0.02
    3:0.02
    4:0.22
    5:0.03
    6:0.01
    7:0.05
    Negative Logits
     myſelf
    -0.72
     Efq
    -0.71
     raiſ
    -0.68
     itſelf
    -0.66
     ſche
    -0.66
     poffe
    -0.63
     purpoſe
    -0.63
     greateſt
    -0.62
    ſelves
    -0.62
     fubject
    -0.62
    Positive Logits
    </b>
    0.28
    Tembelea
    0.27
    antd
    0.26
     heures
    0.26
    みましょう
    0.25
     g
    0.25
     Me
    0.25
    ziale
    0.25
     t
    0.25
    </i>
    0.25
    Act Density 0.007%

    No Known Activations