INDEX
    Explanations

    pronouns and references to individuals, emphasizing personal agency and actions

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.06
    3:0.13
    4:0.19
    5:0.03
    6:0.14
    7:0.13
    8:0.07
    9:0.03
    10:0.06
    11:0.07
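To make the head attribution weights above easier to read, here is a minimal Python sketch that ranks the heads by weight and sums them. The names `head_attr_weights` and `top_heads` are hypothetical, introduced only for illustration; the numbers are copied from the list above.

```python
# Head attribution weights copied from the list above (head index -> weight).
# The variable and function names are illustrative, not part of any tool's API.
head_attr_weights = {
    0: 0.02, 1: 0.02, 2: 0.06, 3: 0.13, 4: 0.19, 5: 0.03,
    6: 0.14, 7: 0.13, 8: 0.07, 9: 0.03, 10: 0.06, 11: 0.07,
}


def top_heads(weights, k=3):
    """Return the k (head, weight) pairs with the largest weights.

    Ties keep their original order because Python's sort is stable.
    """
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


top3 = top_heads(head_attr_weights)
total = round(sum(head_attr_weights.values()), 2)

print(top3)   # heads carrying the most attribution
print(total)  # sum of all listed weights, rounded to two decimals
```

On these numbers, heads 4, 6, and 3 rank highest, with head 7 tied with head 3 at 0.13. The listed weights sum to 0.95 rather than 1, which may simply reflect rounding to two decimals.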
    Negative Logits
    " Thumbnails"     -1.89
    "thumbnails"      -1.53
    " Fantasy"        -1.45
                      -1.41
    "══"              -1.40
    "Offline"         -1.39
    "Enh"             -1.38
    "Vector"          -1.38
    "ANC"             -1.35
    "CLUS"            -1.31
    Positive Logits
    " bothered"        1.68
    " chosen"          1.43
    "erous"            1.38
    " bother"          1.37
    " seeming"         1.30
    " bothers"         1.28
    " spends"          1.28
    " replied"         1.27
    " volunteered"     1.27
    " heck"            1.25
    Act Density 0.006%

    No Known Activations