INDEX
    Explanations

    first-person pronouns and related verbs, indicating personal statements and actions

    Head Attr Weights
    0:0.02
    1:0.01
    2:0.23
    3:0.11
    4:0.22
    5:0.04
    6:0.09
    7:0.04
    8:0.03
    9:0.06
    10:0.05
    11:0.05
    Negative Logits
    untled        -1.50
     ACTIONS      -1.44
    acting        -1.38
    assemb        -1.31
    ardless       -1.31
    raf           -1.28
     unarmed      -1.27
    posing        -1.26
    odge          -1.24
    itting        -1.23
    Positive Logits
    iest          1.33
     Scrib        1.28
                  1.26
     cents        1.26
     tru          1.26
    !.            1.24
    !",           1.24
     Cele         1.23
    "},           1.22
     favourites   1.21
    Act Density 0.005%

    No Known Activations
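
    The activation density figure above can be read as the share of token positions on which the feature fires. The source does not state how the dashboard computes it, so the sketch below rests on an assumption: density is the fraction of positions whose activation exceeds a threshold. The function name, the threshold parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: this mirrors what "Act Density" means; the source
    does not define the metric.
    """
    activations = list(activations)
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: one active position out of 20,000 tokens.
sample = [0.0] * 19_999 + [0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.005%
```

    Under that assumption, a density of 0.005% corresponds to roughly one active position per 20,000 tokens, which would be consistent with a dashboard that has no recorded activations to display.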