    Explanations

    email addresses and domain-related patterns

    Head Attr Weights
    Head    Weight
    0       0.08
    1       0.03
    2       0.05
    3       0.13
    4       0.02
    5       0.09
    6       0.05
    7       0.13
    8       0.06
    9       0.04
    10      0.17
    11      0.10
    Negative Logits
    Token             Logit
    " DRAG"           -1.07
    " ®"              -1.01
    "theless"         -0.93
                      -0.92
    " moreover"       -0.91
    " DRAGON"         -0.86
    " CONTROL"        -0.85
    (whitespace)      -0.84
    " CARD"           -0.82
    "ographically"    -0.81
    Positive Logits
    Token             Logit
    "Twe"             0.91
    "__"              0.90
    "Politics"        0.90
    "___"             0.87
    "Story"           0.86
    "omics"           0.86
    "DonaldTrump"     0.85
    "itbart"          0.85
    "Kid"             0.83
    "haw"             0.82
    Activation Density: 0.040%

    No Known Activations