    Explanations

    pronouns and articles related to existential conditions or perceptions

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.08
    3:0.09
    4:0.20
    5:0.03
    6:0.27
    7:0.06
    8:0.04
    9:0.03
    10:0.05
    11:0.05
    Negative Logits
    " footprints"  -1.54
    "ּ"            -1.30
    " uniforms"    -1.25
    "Tam"          -1.21
    " DV"          -1.21
    " Ys"          -1.20
    " playbook"    -1.18
    " levers"      -1.16
    "Orig"         -1.15
    " rosters"     -1.14
    Positive Logits
    "theless"      1.74
    "ouver"        1.62
    "iful"         1.56
    "ividual"      1.55
    "icultural"    1.54
    "anwhile"      1.49
    " guiName"     1.44
    "seless"       1.43
    "amaz"         1.43
    "entious"      1.38
    Act Density 0.011%

    No Known Activations