INDEX
    Explanations

    pronouns and specific references to individuals

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.03
    3:0.05
    4:0.04
    5:0.04
    6:0.47
    7:0.05
    8:0.04
    9:0.06
    10:0.06
    11:0.08
    Negative Logits
    "hower"           -1.34
    "shire"           -1.33
    " pulp"           -1.29
    "ATTLE"           -1.15
    "advertisement"   -1.13
    "MJ"              -1.12
    " Violet"         -1.11
    "reviewed"        -1.10
    " gates"          -1.06
    "GREEN"           -1.05
    Positive Logits
    "EStream"         1.49
    "agog"            1.47
    "opian"           1.42
    "ensitive"        1.37
    "ukong"           1.36
    "idi"             1.36
    "xual"            1.36
    "amia"            1.35
    "uddin"           1.34
    "ovi"             1.30
    Act Density 0.001%

    No Known Activations